Title DECISION TREE LEARNING
Abstract A method of generating a decision tree is provided. A leaf assignment for each proposed split in generating the decision tree is incremented using a Gray code.
Publication No. US2015012465(A1) Publication Date 2015.01.08
Application No. US201414314517 Filing Date 2014.06.25
Applicant SAS Institute Inc. Inventor Pingenot Joseph Albert F.S.
Classification G06N99/00 Main Classification G06N99/00
Agency Agent
Main Claim 1. A non-transitory computer-readable medium having stored thereon computer-readable instructions that when executed by a computing device cause the computing device to:
(a) receive a target indicator of a target variable to evaluate from a dataset;
(b) receive an input indicator of an input variable to evaluate from the dataset, wherein the dataset includes a plurality of values of the input variable and a plurality of target values of the target variable, wherein a value of the input variable is associated with a target value of the target variable;
(c) determine a number of levels of the input variable to evaluate;
(d) define a level value for each of the determined number of levels based on the plurality of values of the input variable;
(e) assign the plurality of values of the input variable to a level of the determined number of levels based on the defined level value;
(f) identify a maximum number of leaves of a plurality of leaves to evaluate, wherein the identified maximum number of leaves is less than or equal to the determined number of levels;
(g) define a leaf assignment value for each level of the determined number of levels, wherein the leaf assignment value is less than or equal to the identified maximum number of leaves;
(h) compute a decision metric for splitting data in the dataset based on the defined leaf assignment value and the assigned plurality of values of the input variable;
(i) store the defined leaf assignment value and the computed decision metric;
(j) increment a leaf assignment of the determined number of levels using a Gray code to define the leaf assignment value for each level of the determined number of levels, wherein the leaf assignment value for each level varies between zero and the identified maximum number of leaves;
(k) repeat (h) to (j) for each valid Gray code value; and
(l) select a best leaf assignment for the plurality of values of the input variable and the plurality of target values of the target variable based on the computed decision metric.
Address Cary, NC, US
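
Steps (f) through (l) of the main claim can be illustrated with a minimal Python sketch. It is not the patented implementation, and several details are assumptions made only for illustration: the function names `nary_gray` and `best_leaf_assignment`, the choice of weighted Gini impurity as the decision metric (the claim does not name one), leaf labels running from 0 to `max_leaves - 1`, and the rule that a "valid" Gray code value is one that uses at least two leaves. The sketch shows why a Gray code may be useful here: consecutive leaf assignments differ in exactly one level, so the per-leaf target counts can be updated by moving a single level's counts instead of being recomputed from scratch.

```python
from collections import Counter


def nary_gray(n_digits, base):
    """Yield all base**n_digits tuples of a reflected n-ary Gray code.

    Consecutive tuples differ in exactly one digit, by +1 or -1.
    """
    digits = [0] * n_digits
    direction = [1] * n_digits
    yield tuple(digits)
    while True:
        for i in range(n_digits):
            nxt = digits[i] + direction[i]
            if 0 <= nxt < base:
                digits[i] = nxt
                yield tuple(digits)
                break
            # This digit hit its limit: reverse it and try the next digit.
            direction[i] = -direction[i]
        else:
            return  # no digit could move, so every code has been visited


def gini(counts):
    """Gini impurity of a Counter of target-class counts."""
    n = sum(counts.values())
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_leaf_assignment(level_of_row, targets, n_levels, max_leaves):
    """Return (metric, assignment) with the lowest weighted Gini impurity.

    level_of_row[r] is the level (0..n_levels-1) of row r, as in step (e).
    The metric and the validity rule are assumptions, not from the claim.
    """
    # Target-class counts per level, computed once.
    level_counts = [Counter() for _ in range(n_levels)]
    for lvl, t in zip(level_of_row, targets):
        level_counts[lvl][t] += 1
    total = len(targets)

    # The first Gray code value places every level in leaf 0.
    leaf_counts = [Counter() for _ in range(max_leaves)]
    for c in level_counts:
        leaf_counts[0].update(c)

    best = (float("inf"), None)
    prev = None
    for assign in nary_gray(n_levels, max_leaves):
        if prev is not None:
            # Exactly one level changed leaf: move only its counts.
            i = next(j for j in range(n_levels) if assign[j] != prev[j])
            leaf_counts[prev[i]].subtract(level_counts[i])
            leaf_counts[assign[i]].update(level_counts[i])
        prev = assign
        if len(set(assign)) < 2:
            continue  # assumed invalid: all levels share one leaf, no split
        metric = sum(sum(c.values()) / total * gini(c) for c in leaf_counts)
        if metric < best[0]:
            best = (metric, assign)
    return best
```

As a usage example under the same assumptions, calling `best_leaf_assignment([0, 0, 1, 1, 2, 2], ["a", "a", "b", "b", "a", "a"], n_levels=3, max_leaves=2)` selects an assignment in which levels 0 and 2 share one leaf and level 1 sits alone in the other, since that split separates the two target classes completely. Note that this exhaustive walk visits `max_leaves ** n_levels` assignments, including relabelings of the same partition, so it is only practical for small numbers of levels.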