Title of Invention |
BALANCING PROVENANCE AND ACCURACY TRADEOFFS IN DATA MODELING |
Abstract |
Generating a data model may include receiving a raw data set and generating a first repository based on a first set of features of the raw data set, a second repository having a second set of features based on an aggregation of features of the first repository, and a third repository having a third set of features based on the first and second feature sets. The data model may be generated based on a tradeoff between accuracy and provenance of the model. |
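The three-repository structure described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the function `build_repositories`, the sample feature names, and the choice to form the third feature set by merging the first two (the abstract only says the third set is "based on" them).

```python
from statistics import mean


def build_repositories(raw_records, aggregations):
    """Return (first, second, third) repositories as lists of feature dicts.

    raw_records  -- list of dicts, one per record, holding raw features
    aggregations -- dict mapping an aggregated feature name to a function
                    that computes it from one raw record (hypothetical)
    """
    # First repository: the raw features exactly as received.
    first = [dict(record) for record in raw_records]
    # Second repository: features aggregated from the first repository.
    second = [
        {name: fn(record) for name, fn in aggregations.items()}
        for record in first
    ]
    # Third repository: here assumed to be the union of both feature sets.
    third = [{**raw, **agg} for raw, agg in zip(first, second)]
    return first, second, third


# Illustrative data only; these feature names do not come from the patent.
records = [{"sbp_visit1": 150, "sbp_visit2": 160}]
rules = {"sbp_mean": lambda r: mean([r["sbp_visit1"], r["sbp_visit2"]])}
first, second, third = build_repositories(records, rules)
```

In this sketch the raw readings stay traceable in the first repository (provenance), while the aggregated feature in the second repository offers a more compact input for modeling.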
Publication Number |
US2015178622(A1) |
Publication Date |
2015.06.25 |
Application Number |
US201314133963 |
Filing Date |
2013.12.19 |
Applicant |
International Business Machines Corporation |
Inventors |
Guttmann Christian;Sun Xing Zhi |
Classification Codes |
G06N5/02;G06Q50/22 |
Main Classification Code |
G06N5/02 |
Patent Agency |
|
Patent Agent |
|
Principal Claim |
1. A computer implemented method for generating an analytics model, the method comprising:
receiving a data set having a defined first set of features;
defining a second set of features based on an application of a set of domain knowledge data to the first set of features;
generating a features hierarchy based on relationships between features of the first and second sets of features; and
generating an analytics model based on a selection of features from the features hierarchy, wherein the analytics model includes a highest number of features of the second set of features while maintaining a defined accuracy value. |
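The selection step in the claim can be sketched as a simple greedy search. This is only one possible reading, not the patented procedure: the hierarchy representation, the function names `build_hierarchy` and `select_features`, the caller-supplied `accuracy_of` callback, and the toy accuracy numbers are all assumptions. A greedy pass is also not guaranteed to find the true maximum number of second-set features.

```python
def build_hierarchy(domain_rules):
    """Map each derived (second-set) feature to the raw features it abstracts."""
    return {derived: frozenset(children)
            for derived, children in domain_rules.items()}


def select_features(raw_features, hierarchy, accuracy_of, min_accuracy):
    """Greedily swap raw features for derived ones while accuracy holds.

    accuracy_of is a hypothetical callback that trains and scores a model
    on a given feature set and returns its accuracy.
    """
    selected = set(raw_features)
    for derived, children in hierarchy.items():
        if not children <= selected:
            continue  # some child was already replaced by another feature
        candidate = (selected - children) | {derived}
        if accuracy_of(frozenset(candidate)) >= min_accuracy:
            selected = candidate  # keep the derived feature
    return selected


# Toy example: feature names and accuracy penalties are made up.
RAW = ["systolic_bp", "diastolic_bp", "hba1c", "fasting_glucose"]
RULES = {
    "hypertension": ["systolic_bp", "diastolic_bp"],
    "diabetes": ["hba1c", "fasting_glucose"],
}


def toy_accuracy(features):
    acc = 0.90
    if "hypertension" in features:
        acc -= 0.02
    if "diabetes" in features:
        acc -= 0.08
    return acc


result = select_features(RAW, build_hierarchy(RULES), toy_accuracy, 0.85)
```

With the toy numbers above, `hypertension` replaces the two blood-pressure readings because the accuracy stays above the 0.85 threshold, while `diabetes` is rejected because accepting it would push the accuracy below that threshold.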
Address |
Armonk NY US |