Title: Drift annealed time series prediction
Abstract: Methods, computer program products, and systems are presented. The methods include, for instance: generating a drift annealed time series prediction model based on training data. In one embodiment the generating may include: recording an ensemble of candidate models for at least one predictor variable of the training data responsive to creating the ensemble based on the training data by machine learning. The ensemble includes three candidate models, each represented by a respective prediction function that formulates a potentially predictive relationship between a target variable and predictor variables in the training data. The respective candidate models in the ensemble are manipulated to adjust the degrees associated with predictor variables such that the respective new models take the relative importance of predictor variables into account, and the drift annealed time series prediction model based on the new models is produced.
Publication number: US9542646(B1)  Publication date: 2017.01.10
Application number: US201615007990  Filing date: 2016.01.27
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION  Inventors: Baughman Aaron Keith;Marzorati Mauro;Van der Stockt Stefan Aloysius Gert
Classification: G06N3/12;G06N99/00  Main classification: G06N3/12
Agency:  Agent: McLane, Esq. Christopher K.;Song, Esq. Hye Jin Lucy
Principal claim: 1. A method for generating a drift annealed time series prediction model based on training data, comprising:
determining that the drift annealed time series prediction model is formulated for a first cohort of at least one predictor variable according to a first input obtained by a computer;
creating and recording an ensemble of candidate models for at least one predictor variable of the training data, in a memory of a computer, the ensemble comprising a first candidate model, a second candidate model, and a third candidate model, wherein the first candidate model is represented by a linear prediction function, the second candidate model is represented by a quadratic prediction function, and the third candidate model is represented by a cubic prediction function, and wherein the linear prediction function is for long-term forecasting, the quadratic prediction function is for mid-term forecasting, and the cubic prediction function is for short-term forecasting, and wherein the training data comprises instances of said at least one predictor variable and a target variable having a predictive relationship with said at least one predictor variable;
creating and recording a new ensemble of new models in the memory, the new ensemble comprising a first new model, a second new model, and a third new model, wherein the respective new models result from calculating respective new degrees for each candidate model of the ensemble such that respective new models take relative importance of predictor variables into account,
wherein the first new model is created by calculating a first new degree for the first candidate model, the first new degree resulting from $\left\lfloor \frac{1}{\sum N} \sum_{i=1}^{N} d_i \, (N-(r_i-1)) \right\rfloor$, that is, a mathematical floor for a first sum of $d_i(N-(r_i-1))$ divided by a second sum of $N$, wherein $d_i$ indicates a degree of a polynomial of a prediction function corresponding to i-th predictor variable of the first candidate model, wherein $N$ indicates a total number of predictor variables in the first candidate model, and wherein $r_i$ indicates a respective rank of said i-th predictor variable in the first candidate model such that said i-th predictor variable with a smaller value of $r_i$, indicating a rank higher than other predictor variable, weighs more in the first new degree,
wherein the second new model is created by calculating a second new degree for the second candidate model, the second new degree resulting from $\left\lfloor \frac{1}{\sum N} \sum_{i=1}^{N} d_i \, (N-(r_i-1)) \right\rfloor$, that is, a mathematical floor for a second sum of $d_i(N-(r_i-1))$ divided by a second sum of $N$, wherein $d_i$ indicates a degree of a polynomial of a prediction function corresponding to i-th predictor variable of the second candidate model, wherein $N$ indicates a total number of predictor variables in the second candidate model, and wherein $r_i$ indicates a respective rank of said i-th predictor variable in the second candidate model such that said i-th predictor variable with a smaller value of $r_i$, indicating a rank higher than other predictor variable, weighs more in the second new degree, and
wherein the third new model is created by calculating a third new degree for the third candidate model, the third new degree resulting from $\left\lfloor \frac{1}{\sum N} \sum_{i=1}^{N} d_i \, (N-(r_i-1)) \right\rfloor$, that is, a mathematical floor for a third sum of $d_i(N-(r_i-1))$ divided by a third sum of $N$, wherein $d_i$ indicates a degree of a polynomial of a prediction function corresponding to i-th predictor variable of the third candidate model, wherein $N$ indicates a total number of predictor variables in the third candidate model, and wherein $r_i$ indicates a respective rank of said i-th predictor variable in the third candidate model such that said i-th predictor variable with a smaller value of $r_i$, indicating a rank higher than other predictor variable, weighs more in the third new degree;
instantiating the drift annealed time series prediction model with the recorded new ensemble; and
sending, to an output device of the computer, the drift annealed time series prediction model, such that the drift annealed time series prediction model is utilized for forecasting accurately in the future without drifts.
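The new-degree calculation recited in the claim can be illustrated with a minimal Python sketch. It is not taken from the patent: the function name `new_degree` and the sample inputs are hypothetical, ranks are assumed to be a permutation of 1..N, and the denominator described as a "sum of N" is read here as 1 + 2 + ... + N, which is the total of the rank weights N − (r_i − 1). Under that reading, a model whose predictor variables all share one degree keeps that degree.

```python
def new_degree(degrees, ranks):
    """Rank-weighted, floored degree for one candidate model.

    degrees[i] is d_i, the polynomial degree tied to the i-th predictor variable.
    ranks[i] is r_i, where rank 1 marks the most important predictor variable.
    """
    n = len(degrees)
    if n == 0 or len(ranks) != n:
        raise ValueError("degrees and ranks must be non-empty and equally long")
    # ASSUMPTION: ranks are distinct and cover 1..N exactly once.
    if sorted(ranks) != list(range(1, n + 1)):
        raise ValueError("ranks must be a permutation of 1..N")

    # Numerator: sum of d_i * (N - (r_i - 1)); rank 1 gets weight N, rank N gets weight 1.
    weighted = sum(d * (n - (r - 1)) for d, r in zip(degrees, ranks))

    # ASSUMPTION: the "sum of N" in the denominator is 1 + 2 + ... + N.
    normalizer = n * (n + 1) // 2

    # Integer floor division implements the mathematical floor.
    return weighted // normalizer


# Hypothetical inputs: the top-ranked variable pulls the result toward its own degree.
print(new_degree([3, 1, 2], [1, 2, 3]))  # (9 + 2 + 2) // 6
print(new_degree([1, 3, 2], [1, 2, 3]))  # (3 + 6 + 2) // 6
```

With three predictor variables the weights are 3, 2, and 1, so swapping the degrees of the first- and second-ranked variables changes the floored result even though the set of degrees is the same.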
Address: Armonk NY US