Abstract |
The number of possible gene combinations is vast, which has made it difficult to estimate accurately and efficiently, from among all combinations, the gene that causes an event. The present invention provides a data processing device comprising: an acquisition unit for acquiring a plurality of sample data in which the value of each of a plurality of explanatory variables is associated with the occurrence or non-occurrence of an event; an explanatory variable selection unit for selecting a set of explanatory variables from the plurality of explanatory variables; a learning processing unit for learning, for each of a plurality of selected sets of explanatory variables and on the basis of the plurality of sample data, a prediction model that predicts the occurrence or non-occurrence of the event from the value of each selected explanatory variable; a model selection unit for preferentially selecting, from among the plurality of prediction models corresponding to the different sets of selected explanatory variables, a prediction model with a higher evaluation; and a determination unit for determining, as a set of cause explanatory variables, the set of selected explanatory variables that corresponds to the prediction model selected by the model selection unit. |
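The flow of units described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation. The abstract does not specify the model type, the evaluation measure, or how candidate sets are generated, so the sketch assumes a majority-vote lookup table as the prediction model, plain accuracy as the evaluation, and exhaustive enumeration of fixed-size sets. All function names (`learn_model`, `evaluate`, `find_cause_variables`) and the synthetic data are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations, product


def learn_model(samples, variables):
    """Learning processing unit (assumed model: majority outcome per value tuple)."""
    counts = defaultdict(Counter)
    for values, occurred in samples:
        counts[tuple(values[v] for v in variables)][occurred] += 1
    table = {key: c.most_common(1)[0][0] for key, c in counts.items()}
    # Fallback for value tuples never seen during learning.
    default = Counter(occurred for _, occurred in samples).most_common(1)[0][0]

    def predict(values):
        return table.get(tuple(values[v] for v in variables), default)

    return predict


def evaluate(predict, samples):
    """Assumed evaluation measure: fraction of samples predicted correctly."""
    hits = sum(predict(values) == occurred for values, occurred in samples)
    return hits / len(samples)


def find_cause_variables(samples, n_variables, set_size):
    """Select variable sets, learn one model per set, keep the best-evaluated set."""
    best_set, best_score = None, -1.0
    # Explanatory variable selection unit (assumed: every set of the given size).
    for variables in combinations(range(n_variables), set_size):
        model = learn_model(samples, variables)
        score = evaluate(model, samples)
        # Model selection unit: prefer the model with the higher evaluation.
        if score > best_score:
            best_set, best_score = variables, score
    # Determination unit: the winning set is reported as the cause variables.
    return best_set, best_score


# Acquisition unit stand-in: synthetic samples over four binary variables,
# where the event occurs only when variables 1 and 3 are both 1.
samples = [(values, values[1] == 1 and values[3] == 1)
           for values in product((0, 1), repeat=4)]

best_set, best_score = find_cause_variables(samples, n_variables=4, set_size=2)
print(best_set, best_score)
```

Exhaustive enumeration is used here only because the toy problem is tiny. With gene-scale data the number of sets grows combinatorially, which is the difficulty the abstract opens with, so a practical selection unit would need a cheaper search strategy. Scoring on the same samples used for learning is also a simplification; held-out data would give a fairer evaluation.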