Title of invention: INTERACTION DETECTION FOR GENERALIZED LINEAR MODELS
Abstract: Provided are techniques for interaction detection for generalized linear models. Basic statistics are calculated for a pair of categorical predictor variables and a target variable from a dataset during a single pass over the dataset. It is determined whether there is a significant interaction effect for the pair of categorical predictor variables on the target variable by: calculating a log-likelihood value for a full generalized linear model without estimating model parameters; calculating the model parameters for a reduced generalized linear model with a recursive marginal mean accumulation technique using the basic statistics; calculating a log-likelihood value for the reduced generalized linear model; calculating a likelihood ratio test statistic using the log-likelihood value for the full generalized linear model and the log-likelihood value for the reduced generalized linear model; calculating a p-value of the likelihood ratio test statistic; and comparing the p-value to a significance level.
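The single-pass step in the abstract can be illustrated with a minimal sketch. The record does not say which quantities count as "basic statistics"; the per-cell count, sum, and sum of squares of the target used here are an assumption, as are the names `cell_statistics` and `records`.

```python
from collections import defaultdict

def cell_statistics(records):
    """Accumulate per-cell statistics of the target in one pass.

    `records` is an iterable of (a, b, y) tuples: the two categorical
    predictor values and a numeric target value. The choice of count,
    sum, and sum of squares is an assumption made for illustration.
    """
    stats = defaultdict(lambda: [0, 0.0, 0.0])  # (a, b) -> [n, sum_y, sum_y2]
    for a, b, y in records:  # each record is read exactly once
        cell = stats[(a, b)]
        cell[0] += 1
        cell[1] += y
        cell[2] += y * y
    return dict(stats)

# Hypothetical toy data: two predictors with levels {x, z} and {u, v}.
data = [("x", "u", 2), ("x", "u", 4), ("x", "v", 1), ("z", "u", 3)]
stats = cell_statistics(data)
for cell, (n, s, s2) in sorted(stats.items()):
    print(cell, n, s, s2)
```

Because every later quantity can be derived from these per-cell totals, the dataset itself does not need to be read again once the pass is complete.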
Publication number: US2015006605(A1)  Publication date: 2015.01.01
Application number: US201414486659  Filing date: 2014.09.15
Applicant: International Business Machines Corporation  Inventors: Chu Yea J.; Han Sier; Shyr Jing-Yun
Classification: G06F7/60  Main classification: G06F7/60
Agency:  Agent:
Principal claim: 1. A method, comprising: calculating, using a computer, basic statistics for a pair of categorical predictor variables and a target variable from a dataset during a single pass over the dataset; and determining, using the computer, whether there is a significant interaction effect for the pair of categorical predictor variables on the target variable by: calculating, using the computer, a log-likelihood value for a full generalized linear model without estimating model parameters; calculating, using the computer, the model parameters for a reduced generalized linear model with a recursive marginal mean accumulation technique using the basic statistics; calculating, using the computer, a log-likelihood value for the reduced generalized linear model; calculating, using the computer, a likelihood ratio test statistic using the log-likelihood value for the full generalized linear model and the log-likelihood value for the reduced generalized linear model; calculating, using the computer, a p-value of the likelihood ratio test statistic; and comparing, using the computer, the p-value to a significance level.
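The test sequence in the claim can be sketched for one concrete case. The following is an illustration only, assuming a Poisson target with a log link and every cell non-empty with positive row and column totals. The record does not describe the recursive marginal mean accumulation technique, so the reduced (main-effects) model is fitted here with a different, standard method: alternating updates of the Poisson score equations. All function and variable names are hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper-tail chi-square probability, using the power series for the
    regularized lower incomplete gamma function. Adequate for moderate x."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = total = 1.0 / a
    n = 0
    while abs(term) > 1e-16 * abs(total) and n < 10000:
        n += 1
        term *= z / (a + n)
        total += term
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)

def interaction_lrt(counts, sums, alpha=0.05, iters=200):
    """Likelihood ratio test for a two-way interaction (Poisson, log link).

    counts[i][j] is the number of records in cell (i, j) and sums[i][j] is
    the sum of the target there. Log-likelihoods omit the log(y!) term,
    which cancels in the difference.
    """
    R, C = len(counts), len(counts[0])
    cells = [(i, j) for i in range(R) for j in range(C)]

    # Full model: with both main effects and the interaction, the fitted
    # mean of each cell is its observed mean, so no parameters are estimated.
    ll_full = 0.0
    for i, j in cells:
        s, n = sums[i][j], counts[i][j]
        if s > 0:
            ll_full += s * math.log(s / n)
        ll_full -= s  # n * (s / n)

    # Reduced model mu_ij = a_i * b_j (assumed fitting method, see lead-in).
    row_tot = [sum(sums[i]) for i in range(R)]
    col_tot = [sum(sums[i][j] for i in range(R)) for j in range(C)]
    a, b = [1.0] * R, [1.0] * C
    for _ in range(iters):
        a = [row_tot[i] / sum(counts[i][j] * b[j] for j in range(C))
             for i in range(R)]
        b = [col_tot[j] / sum(counts[i][j] * a[i] for i in range(R))
             for j in range(C)]
    ll_reduced = 0.0
    for i, j in cells:
        mu = a[i] * b[j]
        ll_reduced += sums[i][j] * math.log(mu) - counts[i][j] * mu

    stat = 2.0 * (ll_full - ll_reduced)
    df = (R - 1) * (C - 1)
    p_value = chi2_sf(stat, df)
    return stat, df, p_value, p_value < alpha

# Hypothetical 2x2 example in which the effect of one predictor reverses
# across the levels of the other.
counts = [[10, 10], [10, 10]]
sums = [[10, 40], [40, 10]]
stat, df, p_value, significant = interaction_lrt(counts, sums)
print(stat, df, p_value, significant)
```

The function takes only per-cell totals as input, which matches the claim's separation between the single data pass and the test itself. Other distributions and link functions would need their own log-likelihood expressions.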
Address: Armonk, NY, US