Principal claim |
1. A system to determine a final model to forecast information, the system comprising:
a multidimensional data storage system that includes a data storage that stores information for models, the multidimensional storage system comprising:
a metadata layer that stores, for each model:
a relationship between variables and an objective;
a dimension for each of the variables;
a plurality of levels for the dimensions of the variables, the plurality of levels defining a hierarchy of levels for each of the dimensions;
assumption rules for the variables describing how the variables impact the objective or how the variables impact other variables;
aggregation rules for the variables that describe how to aggregate up from a lowest level to higher levels of the dimension, and a transformation to apply for each level;
a data layer that stores data for the variables in each model, the data layer comprising data at the lowest level of each dimension; and
a multidimensional query layer that receives a request for a multidimensional query and aggregates across different levels of the hierarchy of levels for the variables using the aggregation rules stored in the metadata layer;
a model generator executed by a processor that generates a candidate model using the variables and the assumption rules;
a model evaluation module executed by the processor to:
determine, for each of the variables in the candidate model, a dimension and level for the variable, and execute, by the multidimensional query layer, a query to retrieve data for the dimension and the level for each variable by aggregating data for a lowest level of the dimension to the determined level according to the aggregation rules;
determine a statistical significance measure to the objective based on the retrieved data for the dimension and the level for each of the variables; and
determine an indication of relevance for each of the variables in the candidate model indicating a level of impact each of the variables has on the objective, wherein each of the assumption rules specifies a condition, and the model evaluation module is to determine the indication of relevance for each of the variables based on whether the condition in at least one of the assumption rules is satisfied; and
determine which of the variables in the candidate model to retain based on a comparison of the statistical significance measures to a predetermined relevance threshold;
wherein the model generator:
determines modifications to the assumption rules;
determines whether the assumption rules include mutually exclusive assumption rules;
in response to an identification of the mutually exclusive assumption rules, deletes one of the mutually exclusive assumption rules based on the statistical significance measures of the variables; and
generates a new candidate model based on at least one of the modifications to the assumption rules, a modification to the variables, the statistical significance measures, and an indication of relevance for each of the variables in the new candidate model,
wherein one of the candidate model and the new candidate model is selected as the final model based on a comparison of at least one of the statistical measures and the indication of relevance for the variables in each of the candidate model and the new candidate model. |
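
As a rough illustration only, and not the claimed implementation, the three mechanical steps recited above (aggregating lowest-level data up a dimension hierarchy, retaining variables by comparing a significance measure to a threshold, and deleting one of two mutually exclusive assumption rules) might be sketched as follows. Every name and value here (`HIERARCHY`, `AGGREGATION_RULES`, `DATA`, the variables `tv_spend` and `price`, the significance scores) is hypothetical. The sketch also assumes that a higher significance score means a more significant variable, and that the rule tied to the less significant variable is the one deleted; the claim does not fix either choice.

```python
from collections import defaultdict

# Hypothetical metadata layer: levels of each dimension, lowest level first.
HIERARCHY = {"geography": ["city", "state", "country"]}

# Hypothetical aggregation rules: how each variable rolls up a dimension.
AGGREGATION_RULES = {
    "tv_spend": sum,                          # spend adds up
    "price": lambda xs: sum(xs) / len(xs),    # price is averaged
}

# Hypothetical data layer: values stored only at the lowest level.
# Each record carries the member at every level, lowest level first.
DATA = {
    "tv_spend": [
        (("Austin", "TX", "US"), 10.0),
        (("Dallas", "TX", "US"), 30.0),
        (("Boston", "MA", "US"), 20.0),
    ],
    "price": [
        (("Austin", "TX", "US"), 2.0),
        (("Dallas", "TX", "US"), 4.0),
        (("Boston", "MA", "US"), 3.0),
    ],
}


def aggregate(variable, dimension, level):
    """Roll lowest-level data up to `level` using the variable's rule."""
    idx = HIERARCHY[dimension].index(level)
    groups = defaultdict(list)
    for path, value in DATA[variable]:
        groups[path[idx]].append(value)
    rule = AGGREGATION_RULES[variable]
    return {member: rule(values) for member, values in groups.items()}


def retain_variables(significance, threshold):
    """Keep variables whose score meets the threshold (assumed: higher is better)."""
    return [name for name, score in significance.items() if score >= threshold]


def drop_exclusive_rule(rules, exclusive_pair, significance):
    """Delete the rule of a mutually exclusive pair whose variable scores lower.

    `rules` maps a rule name to the variable it concerns.
    """
    a, b = exclusive_pair
    loser = a if significance[rules[a]] < significance[rules[b]] else b
    return {name: var for name, var in rules.items() if name != loser}


if __name__ == "__main__":
    scores = {"tv_spend": 0.9, "price": 0.2}  # hypothetical significance measures
    print(aggregate("tv_spend", "geography", "state"))
    print(retain_variables(scores, threshold=0.5))
    print(drop_exclusive_rule({"r1": "tv_spend", "r2": "price"}, ("r1", "r2"), scores))
```

In this sketch the query step never stores aggregated values; it derives them on request from the lowest-level records, which mirrors the claim's requirement that the data layer hold data only at the lowest level of each dimension.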