Abstract |
PURPOSE: A method for constructing a distributed decision tree is provided, offering a machine learning algorithm that builds a decision tree from existing databases (DBs). CONSTITUTION: Attributes of the data obtained from an information provider (IP) or a wrapper for each data source of the distributed DBs are calculated, together with sufficient statistics of those attributes (S200). A splitting criterion for building the decision tree is calculated from the sufficient statistics (S300). Splitting is repeated according to the splitting criterion through recursive calls (S500). Splitting is terminated by a pruning algorithm based on the sufficient statistics (S700). A reference value for each leaf node is calculated from the sufficient statistics (S800). [Reference numerals] (AA) Yes; (BB) No; (S100) Inserting the root node into a queue; (S200) Calculating attributes that are independent of the information provider or wrapper for each data source, and sufficient statistics of those attributes; (S300) Calculating a splitting criterion for building a decision tree based on the sufficient statistics; (S400) Does a node that can be split exist?; (S500) Splitting on attributes according to the splitting criterion; (S600) Inserting the split nodes into the queue; (S700) Terminating splitting by using a pruning algorithm based on the sufficient statistics; (S800) Calculating a reference value of a leaf node based on the sufficient statistics |
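
Steps S200, S300 and S800 can be illustrated with a minimal Python sketch. It rests on assumptions the abstract does not state: the sufficient statistics are taken to be counts of (attribute value, class) pairs, the splitting criterion is taken to be information gain, and the leaf reference value is taken to be the majority class. All function names, the `outlook`/`play` fields, and the sample rows are hypothetical. Because such counts are additive, each data source can compute them locally and only the counts need to be combined.

```python
from collections import Counter
from math import log2

def local_counts(rows, attribute, label):
    """Assumed sufficient statistics for one data source:
    counts of (attribute value, class) pairs."""
    return Counter((row[attribute], row[label]) for row in rows)

def merge_counts(per_source):
    """Counts are additive, so per-source statistics are simply summed."""
    total = Counter()
    for counts in per_source:
        total.update(counts)
    return total

def entropy(class_counts):
    """Shannon entropy (in bits) of a collection of class counts."""
    class_counts = list(class_counts)
    n = sum(class_counts)
    return -sum(c / n * log2(c / n) for c in class_counts if c > 0)

def information_gain(counts):
    """Assumed splitting criterion, computed from the merged counts only."""
    by_class = Counter()
    by_value = {}
    for (value, cls), c in counts.items():
        by_class[cls] += c
        by_value.setdefault(value, Counter())[cls] += c
    n = sum(by_class.values())
    remainder = sum(
        sum(vc.values()) / n * entropy(vc.values()) for vc in by_value.values()
    )
    return entropy(by_class.values()) - remainder

def majority_class(counts):
    """Assumed leaf reference value: the most frequent class."""
    by_class = Counter()
    for (_, cls), c in counts.items():
        by_class[cls] += c
    return by_class.most_common(1)[0][0]

# Hypothetical rows held by two separate data sources.
source_a = [
    {"outlook": "sunny", "play": "no"},
    {"outlook": "rain", "play": "yes"},
]
source_b = [
    {"outlook": "sunny", "play": "no"},
    {"outlook": "rain", "play": "yes"},
    {"outlook": "rain", "play": "yes"},
]

merged = merge_counts(
    [local_counts(src, "outlook", "play") for src in (source_a, source_b)]
)
gain = information_gain(merged)
leaf_value = majority_class(merged)
```

In this sketch the merged counts equal the counts that would be obtained by pooling all rows in one place, which is why the criterion can be evaluated without moving the raw data out of the individual databases.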