Enhancing classification and prediction using predictive modeling, Application No. US201514594600

Title of invention	Enhancing classification and prediction using predictive modeling
Abstract	In one embodiment, a system for enhancing predictive modeling includes an interface operable to receive a first dataset. The system may also include a processor communicatively coupled to the interface that is operable to generate a holdout dataset based on the first dataset. The processor may also train each of a plurality of boosting models in parallel using the first dataset, wherein for each of a number of iterations, training comprises: building a one-level binary decision tree to train a split-node variable; calculating an impurity of the split-node variable; and calculating an optimal split node, wherein the optimal split node is the split-node variable with a lowest impurity between the plurality of boosting models. The system may then determine a final model based on one of the plurality of boosting models that provides the lowest error rate when applied to the holdout dataset.
Publication No.	US9171259(B1)	Publication date	2015.10.27
Application No.	US201514594600	Filing date	2015.01.12
Applicant	Bank of America Corporation	Inventors	Laxmanan Kasilingam Basker;Chen Yudong;Song Peng
Classification	G06F15/18;G06N5/02;G06N99/00;G06F17/50	Main classification	G06F15/18
Agency		Agent	Springs Michael A.
Main claim	1. A method for enhancing predictive modeling, comprising: receiving, at an interface, a first dataset; generating, with a processor, a holdout dataset based on the first dataset; training, with the processor, each of a plurality of boosting models in parallel using the first data set, wherein for each of a number of iterations, training comprises: building a one-level binary decision tree to train a split-node variable; calculating an impurity of the split-node variable; calculating an optimal split node, wherein the optimal split node is the split-node variable with a lowest impurity between the plurality of boosting models; and calculating a total variance for the split node variable; determining a final model, based on one of the plurality of boosting models that provides a lowest error rate when applied to the holdout dataset.
Address	Charlotte NC US
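
The steps named in the abstract and main claim (holdout set, several boosting models built from one-level binary trees, impurity-based split selection, final model chosen by holdout error) can be illustrated with a minimal Python sketch. This is not the patented method. It assumes AdaBoost-style reweighting, Gini impurity, +1/-1 labels, and a random 30% holdout, none of which the record specifies. It picks the lowest-impurity split within each model only, and it omits the cross-model split comparison and the total-variance step. All function names, the `configs` layout, and the toy data are hypothetical.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def weighted_gini(labels, weights):
    """Gini impurity of a weighted set of +1/-1 labels (assumed impurity measure)."""
    total = sum(weights)
    if total == 0:
        return 0.0
    p = sum(w for y, w in zip(labels, weights) if y == 1) / total
    return 2.0 * p * (1.0 - p)


def best_stump(X, y, w, features):
    """One-level binary tree: return the (feature, threshold) split with lowest impurity."""
    best = None
    total = sum(w)
    for f in features:
        for t in sorted({row[f] for row in X}):
            sides = ([i for i, r in enumerate(X) if r[f] <= t],
                     [i for i, r in enumerate(X) if r[f] > t])
            impurity, leaves = 0.0, []
            for side in sides:
                ys, ws = [y[i] for i in side], [w[i] for i in side]
                impurity += sum(ws) / total * weighted_gini(ys, ws)
                vote = sum(yi * wi for yi, wi in zip(ys, ws))
                leaves.append(1 if vote >= 0 else -1)
            if best is None or impurity < best[0]:
                best = (impurity, f, t, leaves[0], leaves[1])
    return best[1:]


def stump_predict(stump, row):
    f, t, left_label, right_label = stump
    return left_label if row[f] <= t else right_label


def train_boosted(X, y, features, rounds):
    """AdaBoost-style training (an assumption; the record does not name the scheme)."""
    w = [1.0 / len(X)] * len(X)
    model = []
    for _ in range(rounds):
        stump = best_stump(X, y, w, features)
        preds = [stump_predict(stump, row) for row in X]
        err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        w = [wi * math.exp(-alpha * yi * p) for wi, yi, p in zip(w, y, preds)]
        norm = sum(w)
        w = [wi / norm for wi in w]
    return model


def predict(model, row):
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in model)
    return 1 if score >= 0 else -1


def error_rate(model, X, y):
    return sum(predict(model, r) != yi for r, yi in zip(X, y)) / len(X)


def select_final_model(dataset, configs, holdout_fraction=0.3, seed=0):
    """Split off a holdout set, train one model per config, keep the lowest holdout error."""
    rows = list(dataset)
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * holdout_fraction)
    holdout, train = rows[:cut], rows[cut:]
    X, y = [r for r, _ in train], [l for _, l in train]
    Xh, yh = [r for r, _ in holdout], [l for _, l in holdout]
    # Threads only illustrate concurrent training; CPython threads do not
    # speed up CPU-bound work.
    with ThreadPoolExecutor() as pool:
        models = list(pool.map(
            lambda c: train_boosted(X, y, c["features"], c["rounds"]), configs))
    errors = [error_rate(m, Xh, yh) for m in models]
    return models[errors.index(min(errors))], errors


def make_toy_data(n, seed=0):
    """Synthetic data: the label depends only on feature 0; feature 1 is noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        row = [rng.random(), rng.random()]
        data.append((row, 1 if row[0] > 0.5 else -1))
    return data
```

Calling `select_final_model(make_toy_data(200), configs)` with a list of `{"features": [...], "rounds": k}` dictionaries returns the selected model together with the holdout error of every candidate, so the comparison behind the choice stays visible.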