Title: FEATURE GENERATION AND MODEL SELECTION FOR GENERALIZED LINEAR MODELS
Abstract: Systems, methods, and other embodiments associated with feature generation and model selection for generalized linear models are described. In one embodiment, a method includes ordering the candidate features in a dataset that are being considered by a streamwise feature selection process, according to an inclusion score that reflects a likelihood that a given candidate feature will be included in the generalized linear model (GLM). The ordered candidate features are provided to the streamwise feature selection process for acceptance testing. In one embodiment, the method also includes selecting, based on characteristics of the dataset, a penalty criterion for use in the acceptance testing.
Publication number: US2014236965(A1); Publication date: 2014.08.21
Application number: US201313772852; Filing date: 2013.02.21
Applicant: ORACLE INTERNATIONAL CORPORATION; Inventor: YARMUS Joseph
Classification: G06F17/30; Main classification: G06F17/30
Agency: ; Agent:
Main claim: 1. A computer-implemented method comprising: identifying a dataset that stores values for a target attribute and input attributes, where the input attributes are under consideration for inclusion in a generalized linear model that predicts a value of the target attribute based on a selection of features, where each feature comprises a combination of one or more of the input attributes; identifying candidate features, where a candidate feature comprises a combination of one or more of the input attributes; computing respective inclusion scores for respective candidate features, based, at least in part, on a likelihood that the candidate feature will be selected for inclusion in the generalized linear model; constructing a set of one or more branches of candidate features ordered according to inclusion score; and providing a branch of candidate features, in order of inclusion score, to a streamwise feature selection process configured to construct the generalized linear model by selecting candidate features for inclusion in the generalized linear model.
Address: Redwood Shores, CA, US
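
The steps recited in the main claim (identify candidate features, score them, order them, and stream them through an acceptance test) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the record: the inclusion score is the absolute Pearson correlation with the target, candidate features are single attributes plus pairwise products, only one branch is built, the model is a Gaussian GLM with identity link (ordinary least squares), and the acceptance test compares the drop in n*log(RSS) against a caller-supplied penalty. The function names and the toy data are hypothetical.

```python
import math
from itertools import combinations

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def center(v):
    m = sum(v) / len(v)
    return [a - m for a in v]

def inclusion_score(x, y):
    """Assumed score: absolute Pearson correlation between feature and target."""
    xc, yc = center(x), center(y)
    denom = math.sqrt(dot(xc, xc) * dot(yc, yc))
    return abs(dot(xc, yc)) / denom if denom > 0 else 0.0

def candidate_features(columns):
    """Single attributes plus pairwise products (one simple kind of combination)."""
    feats = {(name,): list(col) for name, col in columns.items()}
    for a, b in combinations(sorted(columns), 2):
        feats[(a, b)] = [p * q for p, q in zip(columns[a], columns[b])]
    return feats

def streamwise_select(ordered, y, penalty):
    """Test candidates one at a time, in the order given.

    A candidate is accepted when adding it to the least-squares fit lowers
    n*log(RSS) by more than `penalty` (an assumed acceptance test).
    """
    n = len(y)
    resid = center(y)            # residual of the intercept-only model
    rss = dot(resid, resid)
    basis, accepted = [], []     # orthogonalized accepted features, their names
    for name, x in ordered:
        if rss <= 1e-12:         # nothing left to explain
            break
        v = center(x)
        scale = dot(v, v)
        for b in basis:          # Gram-Schmidt against accepted features
            coef = dot(v, b) / dot(b, b)
            v = [vi - coef * bi for vi, bi in zip(v, b)]
        vv = dot(v, v)
        if scale == 0 or vv <= 1e-12 * scale:
            continue             # constant or redundant candidate
        proj = dot(v, resid)
        new_rss = max(rss - proj * proj / vv, 1e-12)
        if n * math.log(rss / new_rss) > penalty:
            coef = proj / vv
            resid = [r - coef * vi for r, vi in zip(resid, v)]
            rss = new_rss
            basis.append(v)
            accepted.append(name)
    return accepted

# Toy data (hypothetical): the target depends almost entirely on attribute "a".
columns = {
    "a": [1, 2, 3, 4, 5, 6, 7, 8],
    "b": [1, -1, 1, -1, 1, -1, 1, -1],
    "c": [3, 1, 4, 1, 5, 9, 2, 6],
}
noise = [0.1, -0.2, 0.1, 0.0, -0.1, 0.2, -0.1, 0.0]
y = [3 * a + e for a, e in zip(columns["a"], noise)]

feats = candidate_features(columns)
# One branch: all candidates, highest inclusion score first.
ordered = sorted(feats.items(), key=lambda kv: inclusion_score(kv[1], y), reverse=True)
accepted = streamwise_select(ordered, y, penalty=math.log(len(y)))
print(accepted)
```

The penalty is left as a parameter because the record says only that the penalty criterion is selected based on characteristics of the dataset, without stating the rule; the call above uses log(n), a BIC-style value, purely as a placeholder. Ordering by score before streaming matters because a streamwise process sees each candidate once, so presenting the most promising candidates first lets later, weaker candidates be judged against a model that already contains the strong ones.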