Title: RANKING OF RANDOM BATCHES TO IDENTIFY PREDICTIVE FEATURES
Abstract: Methods, media, and systems are disclosed for selecting features that are predictive of a particular outcome from large sets of potentially-predictive features. The feature-selection process involves generating random batches of features and ranking the batches according to the accuracy of a predictive model built on each batch. Predictive features are then selected according to the aggregate rank of the batches in which they are included.
Publication number: US2016026917(A1)  Publication date: 2016.01.28
Application number: US201414341914  Filing date: 2014.07.28
Applicant: Causalytics, LLC  Inventors: Weisberg Herbert I.; Pontes Victor P.
Classification: G06N5/04; G06N7/00  Main classification: G06N5/04
Agency:  Agent:
Main claim: 1. A method comprising: receiving observed data representing a set of outcome values and, for each outcome value, a set of corresponding feature values for a set of potentially-predictive features; selecting a plurality of batches, wherein each batch is a randomly selected subset of features from the set of potentially-predictive features; generating, for each respective batch of the plurality of batches, an accuracy value for a predictive model based on the subset of features associated with the respective batch; ranking the plurality of batches according to the generated accuracy values for each batch; determining, for each respective feature in the set of potentially-predictive features, an aggregate rank for the subset of batches that include the respective feature; and selecting, as predictive features, features from the set of potentially-predictive features for which the determined aggregate rank satisfies a predetermined criterion.
Address: Needham MA US
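
Illustrative sketch (not part of the patent record): the steps recited in the main claim can be walked through in a few lines of Python. The record does not specify the predictive model, the accuracy measure, the form of the aggregate rank, or the selection criterion, so everything concrete here is an assumption made for illustration: a nearest-centroid classifier, holdout accuracy on the second half of the rows, the mean rank as the aggregate, and a fixed mean-rank cutoff as the criterion. The function names `batch_accuracy`, `select_predictive_features`, and `make_demo_data` are hypothetical, and the data is synthetic.

```python
import random


def batch_accuracy(X, y, batch, split):
    """Holdout accuracy of a nearest-centroid classifier that sees only the
    features in `batch` (assumed model; the record does not name one).
    Rows [0, split) are used for fitting, rows [split, n) for scoring."""
    train, test = range(split), range(split, len(X))
    centroids = {}
    for label in set(y[i] for i in train):
        rows = [X[i] for i in train if y[i] == label]
        centroids[label] = [sum(r[f] for r in rows) / len(rows) for f in batch]

    def predict(row):
        # Pick the class whose centroid is closest in squared distance.
        return min(
            centroids,
            key=lambda c: sum((row[f] - m) ** 2 for f, m in zip(batch, centroids[c])),
        )

    hits = sum(predict(X[i]) == y[i] for i in test)
    return hits / len(test)


def select_predictive_features(X, y, n_batches, batch_size, max_mean_rank, seed=0):
    """Random batches -> accuracy per batch -> rank batches ->
    mean rank per feature -> keep features under the cutoff."""
    rng = random.Random(seed)
    n_features = len(X[0])
    split = len(X) // 2

    # Each batch is a randomly selected subset of the candidate features.
    batches = [rng.sample(range(n_features), batch_size) for _ in range(n_batches)]
    accuracies = [batch_accuracy(X, y, b, split) for b in batches]

    # Rank 1 is the most accurate batch; ties get consecutive ordinal ranks.
    order = sorted(range(n_batches), key=lambda i: accuracies[i], reverse=True)
    rank = {batch_id: r for r, batch_id in enumerate(order, start=1)}

    # Aggregate rank (assumed: mean) over the batches that include each feature.
    aggregate = {}
    for f in range(n_features):
        ranks = [rank[i] for i, b in enumerate(batches) if f in b]
        if ranks:  # a feature that landed in no batch gets no aggregate rank
            aggregate[f] = sum(ranks) / len(ranks)

    # Assumed criterion: mean rank at or below a fixed cutoff.
    selected = sorted(f for f, r in aggregate.items() if r <= max_mean_rank)
    return selected, aggregate


def make_demo_data(n_rows=200, n_features=20, seed=1):
    """Synthetic data: features 0 and 1 shift with the outcome, the rest are noise."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n_rows)]
    X = []
    for label in y:
        row = [rng.gauss(0, 1) for _ in range(n_features)]
        row[0] += 4 * label
        row[1] += 4 * label
        X.append(row)
    return X, y


X, y = make_demo_data()
selected, aggregate = select_predictive_features(
    X, y, n_batches=400, batch_size=4, max_mean_rank=120
)
print("selected features:", selected)
```

In this sketch a lower mean rank means a feature tends to appear in more accurate batches. The cutoff of 120 out of 400 batches is arbitrary; a top-k rule or a comparison against a permutation baseline would be equally consistent with the "predetermined criterion" wording of the claim.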