Title of invention: Methods and systems for automatic selection of classification and regression trees
Abstract: The present invention provides a method and system for automatically identifying and selecting preferred classification and regression trees. The invention is used to identify a specific decision tree or group of trees that are consistent across train and test samples in node-specific details that are often important to decision makers. Specifically, for a tree to be identified as preferred by this system, the train and test samples must both agree on key measures for every terminal node of the tree. In addition to this node-by-node criterion, an additional tree selection method may be imposed. Accordingly, the train and test samples rank order the nodes on a relevant measure in the same way. Both consistency criteria may be applied in a fuzzy manner in which agreement must be close but need not be exact.
Publication number: US2015370849(A1) Publication date: 2015.12.24
Application number: US200814769453 Filing date: 2008.01.04
Applicant: Steinberg Dan Inventor: Steinberg Dan
Classification: G06F17/30;G06Q40/08;G06N99/00;G06Q30/02 Main classification: G06F17/30
Agency: Agent:
Main claim: 1. A computer-implemented method for automatically selecting a preferred tree among a plurality of trees, comprising: accessing a stored data table comprising a plurality of records and a plurality of columns; identifying a training set of records and a test set of records within the data table records; analyzing the training set to obtain a collection of at least one fitted tree, each tree having at least one node, at least one node of each tree being a terminal node; determining the training set records associated with each node of each tree; determining the test set records associated with each node of each tree; determining an agreement statistic and a rank match statistic associated with each node of each tree; analyzing the agreement statistic and the rank match statistic for each node of each tree; and identifying at least one preferred tree based on the step of analyzing.
Address: San Diego CA US
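
Illustrative sketch (not part of the patent record): the selection steps in the main claim can be approximated in a few lines of Python. The record does not define the agreement statistic or the rank match statistic, so the definitions below are assumptions made only for illustration: a node "agrees" when its train and test response rates fall on the same side of the overall base rate, and a node "rank matches" when no other node is ordered one way by train rate and the opposite way by test rate. The names `NodeStats`, `direction_agrees`, `rank_matches`, and `select_preferred`, and the tolerance parameter `tol` used for the fuzzy variant, are hypothetical. Tree fitting and the assignment of records to nodes are assumed to have happened already, so each tree is represented only by per-node summary rates for its terminal nodes.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class NodeStats:
    """Summary of one terminal node (hypothetical structure)."""
    node_id: int
    train_rate: float  # response rate among training records in the node
    test_rate: float   # response rate among test records in the node


def direction_agrees(node: NodeStats, train_base: float, test_base: float,
                     tol: float = 0.0) -> bool:
    """Assumed agreement statistic: train and test lie on the same side
    of their base rates; the test side may be off by at most `tol`."""
    train_lift = node.train_rate - train_base
    test_lift = node.test_rate - test_base
    if train_lift >= 0:
        return test_lift >= -tol
    return test_lift <= tol


def rank_matches(node: NodeStats, nodes: List[NodeStats],
                 tol: float = 0.0) -> bool:
    """Assumed rank match statistic: no other node is ranked below this
    one on train but above it on test (or the reverse) by more than `tol`."""
    for other in nodes:
        if other.train_rate < node.train_rate and other.test_rate > node.test_rate + tol:
            return False
        if other.train_rate > node.train_rate and other.test_rate < node.test_rate - tol:
            return False
    return True


def select_preferred(trees: Dict[str, List[NodeStats]], train_base: float,
                     test_base: float, tol: float = 0.0) -> List[str]:
    """Return the names of trees whose every terminal node passes both checks."""
    preferred = []
    for name, nodes in trees.items():
        if all(direction_agrees(n, train_base, test_base, tol)
               and rank_matches(n, nodes, tol) for n in nodes):
            preferred.append(name)
    return preferred


# Made-up summary rates for two candidate trees of different sizes.
trees = {
    "small": [NodeStats(1, 0.10, 0.12), NodeStats(2, 0.40, 0.38)],
    "large": [NodeStats(1, 0.05, 0.09), NodeStats(2, 0.15, 0.07),
              NodeStats(3, 0.35, 0.30), NodeStats(4, 0.60, 0.55)],
}

exact = select_preferred(trees, train_base=0.25, test_base=0.25, tol=0.0)
fuzzy = select_preferred(trees, train_base=0.25, test_base=0.25, tol=0.05)
print(exact)  # ['small']
print(fuzzy)  # ['small', 'large']
```

With exact matching, the larger tree is rejected because its two lowest-rate nodes swap order between train and test; with a tolerance of 0.05 the swap is small enough to be accepted, which mirrors the "close but need not be exact" wording of the abstract.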