Title of invention MULTIPLE IMPUTATION OF MISSING DATA IN MULTI-DIMENSIONAL RETAIL SALES DATA SETS VIA TENSOR FACTORIZATION
Abstract A system, method, and computer program product provide multiple imputation of missing data elements in retail data sets used for modeling and decision-support applications. The method exploits the multi-dimensional tensor structure of the retail data sets through tensor factorization and, in a preferred embodiment, can be implemented using fast, scalable imputation methods suitable for large data sets. It generates multiple imputations: a set of complete data sets, each containing one of a plurality of imputed realizations of the values missing from the original data set, so that the variability in the magnitudes of these missing values can be captured for subsequent statistical analysis.
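The idea in the abstract can be illustrated with a minimal sketch. It is not the patented method, whose details the abstract does not give. The sketch assumes a 3-way sales tensor (for example store x product x week) in which NaN marks a missing cell, a rank-R CP (CANDECOMP/PARAFAC) factorization fitted by ridge-regularized alternating least squares on the observed cells only, and variability across imputations that comes from random restarts plus Gaussian noise scaled to the observed residuals. The function names `solve_mode`, `fit_cp`, and `multiple_impute`, and all parameter defaults, are hypothetical.

```python
import numpy as np

def solve_mode(X, mask, F1, F2, reg):
    """Ridge least-squares update of the factor for axis 0, using observed cells only."""
    R = F1.shape[1]
    # Row (j, k) of Z is the elementwise product F1[j] * F2[k].
    Z = (F1[:, None, :] * F2[None, :, :]).reshape(-1, R)
    out = np.empty((X.shape[0], R))
    for i in range(X.shape[0]):
        obs = mask[i].ravel()
        Zo = Z[obs]
        rhs = Zo.T @ X[i].ravel()[obs]
        out[i] = np.linalg.solve(Zo.T @ Zo + reg * np.eye(R), rhs)
    return out

def fit_cp(X, mask, rank, rng, n_iter=50, reg=1e-2):
    """Fit X[i,j,k] ~ sum_r A[i,r] B[j,r] C[k,r] from a random start (assumed model)."""
    I, J, K = X.shape
    A, B, C = (rng.normal(size=(n, rank)) for n in (I, J, K))
    for _ in range(n_iter):
        A = solve_mode(X, mask, B, C, reg)
        B = solve_mode(np.moveaxis(X, 1, 0), np.moveaxis(mask, 1, 0), A, C, reg)
        C = solve_mode(np.moveaxis(X, 2, 0), np.moveaxis(mask, 2, 0), A, B, reg)
    return np.einsum("ir,jr,kr->ijk", A, B, C)

def multiple_impute(X, rank=2, n_imputations=5, seed=0):
    """Return a list of completed tensors; observed cells are left untouched."""
    rng = np.random.default_rng(seed)
    mask = ~np.isnan(X)
    completed = []
    for _ in range(n_imputations):
        X_hat = fit_cp(X, mask, rank, rng)
        # Residual spread on observed cells sets the noise level (an assumption).
        sigma = np.std((X - X_hat)[mask])
        draw = X_hat + rng.normal(scale=sigma, size=X.shape)
        completed.append(np.where(mask, X, draw))
    return completed
```

Each returned tensor is one complete data set. A downstream analysis would be run on every one of them and the results pooled, so that the spread across imputations reflects uncertainty about the missing values.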
Publication number US2013036082(A1) Publication date 2013.02.07
Application number US201113204237 Filing date 2011.08.05
Applicant INTERNATIONAL BUSINESS MACHINES CORPORATION;NATARAJAN RAMESH;BANERJEE ARINDAM;SHAN HANHUAI Inventor NATARAJAN RAMESH;BANERJEE ARINDAM;SHAN HANHUAI
Classification G06N5/02 Main classification G06N5/02
Agency Agent
Principal claim
Address