Title: METHOD OF OPERATING ARTIFICIAL INTELLIGENCE MACHINES TO IMPROVE PREDICTIVE MODEL TRAINING AND PERFORMANCE
Abstract: A method of improving the training and performance of predictive models. A first method of operating an artificial intelligence machine produces predictive model language documents describing improved predictive models that generate better business decisions from raw data record inputs. A second method of operating an artificial intelligence machine, which includes processors for predictive model algorithms, produces and outputs better business decisions from raw data record inputs. Both methods enrich the raw data records fed to their processors by deleting data fields whose data values contribute little to decision making, and by deriving and adding new data fields, from information sources then available, that improve the decisions of the artificial intelligence machine through more accurate predictions.
Publication number: US2016071017(A1)  Publication date: 2016.03.10
Application number: US201514941586  Filing date: 2015.11.14
Applicant: Brighterion, Inc.  Inventor: Adjaoute Akli
Classification: G06N5/04; G06N5/02  Main classification: G06N5/04
Agency:  Agent:
Principal claim: 1. A method of operating an artificial intelligence machine to improve their decisions from included predictive models, comprising:
deleting with at least one processor a selected data field and any data values contained in the selected data field from each of a first series of data training records stored in a memory of the artificial intelligence machine to exclude each data field in the first series of data training records that has more than a threshold number of random data values, or that has only one repeating data value, or that has too small a Shannon entropy, and using any information gained to select the most useful data fields, and then transforming a surviving number of data fields in all the first series of data training records into a corresponding reduced-field series of data training records stored in the memory of the artificial intelligence machine;
adding with the at least one processor a new derivative data field to all the reduced-field series of data training records stored in the memory and initializing each added new derivative data field with a new data value, and including an apparatus for executing an algorithm to either change real scaler numeric data values into fuzzy values, or if symbolic, to change a behavior group data value, and testing that a minimum number of data fields survive, and if not, then to generate a new derivative data field and fix within each an aggregation type, a time range, a filter, a set of aggregation constraints, a set of data fields to aggregate, and a recursive level, and then assessing the quality of a newly derived data field by testing it with a test set of data, and then transforming the results into an enriched-field series of data training records stored in the memory of the artificial intelligence machine;
verifying with the at least one processor that each predictive model if trained with the enriched-field series of data training records stored in the memory produces decisions having fewer errors than the same predictive model trained only with the first series of data training records;
recording a data-enrichment descriptor into the memory to include an identity of selected data fields in a data training record format of the first series of data training records that were subsequently deleted, and which newly derived data fields were subsequently added, and how each newly derived data field was derived and from which information sources;
causing the at least one processor of the artificial intelligence machine to start extracting decisions from a new series of data records of new events by receiving and storing the new series of data records in the memory of the artificial intelligence machine;
causing the at least one processor to fetch the data-enrichment descriptor and use it to select which data fields to delete and then deleting all the data values included in the selected data fields from each of a new series of data records of new events;
wherein, each data field deleted matches a data field in the first series of data training records had more than a threshold number of random data values, or that had only one repeating data value, or that had too small a Shannon entropy;
adding with the at least one processor a new derivative data field to each record of the new series of data records stored in the memory according to the data-enrichment descriptor, and initializing each added new derivative data field with a new data value stored in the memory;
wherein, each new derivative data field added matches a new derivative data field added to the enriched-field series of data training records in which real scaler numeric data values were changed into fuzzy values, or if symbolic, were changed into a behavior group data value stored in the memory, and were tested that a minimum number of data fields survive, and if not, then that generated a new derivative data field and fixed within each an aggregation type, a time range, a filter, a set of aggregation constraints, a set of data fields to aggregate, and a recursive level; and
producing and outputting a series of predictive decisions with the at least one processor that operates at least one predictive model algorithm derived from one originally built and trained with records having a same record format described by the data-enrichment descriptor and stored in the memory of the artificial intelligence machine.
Address: San Francisco, CA, US
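
Illustrative sketch (not part of the patent record): the field-deletion and descriptor steps recited in the principal claim can be pictured roughly as follows. This is a minimal Python sketch under stated assumptions, not the applicant's implementation. The function names, the descriptor layout, and both threshold values are hypothetical. "More than a threshold number of random data values" is interpreted here as "too high a ratio of distinct values", which is only one possible reading. Derivative fields (fuzzy values, behavior groups, aggregations) and the model-verification step are left out.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of `values`."""
    total = len(values)
    counts = Counter(values)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def fields_to_delete(records, max_unique_ratio=0.9, min_entropy=0.5):
    """Map each field that should be dropped to the reason for dropping it.

    Assumes every record is a dict with the same keys and hashable values.
    Both thresholds are hypothetical defaults chosen for illustration.
    """
    reasons = {}
    for field in records[0]:
        values = [record[field] for record in records]
        distinct = len(set(values))
        if distinct == 1:
            reasons[field] = "single repeating value"
        elif distinct / len(values) > max_unique_ratio:
            # Assumed stand-in for the claim's "random data values" test.
            reasons[field] = "too many distinct values"
        elif shannon_entropy(values) < min_entropy:
            reasons[field] = "low Shannon entropy"
    return reasons


def build_descriptor(training_records, **thresholds):
    """Record which training fields were deleted (hypothetical layout)."""
    return {"deleted_fields": fields_to_delete(training_records, **thresholds)}


def apply_descriptor(record, descriptor):
    """Drop from a new record the fields the descriptor marks as deleted."""
    deleted = descriptor["deleted_fields"]
    return {name: value for name, value in record.items() if name not in deleted}


if __name__ == "__main__":
    # Made-up training records, for demonstration only.
    training = [
        {
            "txn_id": i,                            # unique per record
            "currency": "USD",                      # never varies
            "channel": "web" if i % 2 else "pos",   # evenly split
            "test_mode": i == 0,                    # almost always False
        }
        for i in range(10)
    ]
    descriptor = build_descriptor(training)
    print(descriptor)

    new_event = {"txn_id": 99, "currency": "EUR", "channel": "web", "test_mode": False}
    print(apply_descriptor(new_event, descriptor))
```

The point of keeping the descriptor as a separate object is the one the claim makes: new event records are trimmed according to what was learned from the training records, not by re-running the statistics on the new data.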