Title of Invention |
Identifying Optimum Times at which to Retrain a Logistic Regression Model |
Abstract |
An approach is provided in which a knowledge manager trains a machine-learning model and generates a hyperplane based upon a first set of labeled feature vectors. The knowledge manager computes, relative to the hyperplane, a first distribution of a first set of feature vectors corresponding to a first group of source documents. Subsequently, the knowledge manager computes, relative to the same hyperplane, a second distribution of a second set of feature vectors corresponding to a second group of source documents. The knowledge manager then generates an indicator to retrain the machine-learning model in response to determining that a distribution difference between the second distribution and the first distribution reaches a distribution difference threshold. |
Publication Number |
US2016283861(A1) |
Publication Date |
2016.09.29 |
Application Number |
US201514669061 |
Filing Date |
2015.03.26 |
Applicant |
International Business Machines Corporation |
Inventor |
Gerard Scott N. |
Classification Number |
G06N99/00 |
Main Classification Number |
G06N99/00 |
Agency |
|
Agent |
|
Independent Claim |
1. A method implemented by an information handling system that includes a memory and a processor, the method comprising:
training a machine-learning model utilizing a first set of labeled feature vectors, wherein the training results in a first hyperplane;
computing a first distribution of a first set of feature vectors relative to the first hyperplane, wherein the first set of feature vectors correspond to a first group of source documents;
computing a second distribution of a second set of feature vectors relative to the first hyperplane, wherein the second set of feature vectors correspond to at least a second group of source documents; and
generating an indicator in response to determining that a distribution difference between the second distribution and the first distribution reaches a distribution difference threshold, wherein the machine-learning model is retrained based on the generated indicator. |
Address |
Armonk NY US |
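
The steps recited in the independent claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the record does not say how the distributions or their difference are computed, so the sketch assumes a histogram of signed distances to the hyperplane as the "distribution" and total variation distance as the "distribution difference". All function names, the bin range, and the threshold value of 0.2 are hypothetical choices made for illustration.

```python
import math


def train_logistic_regression(X, y, lr=0.1, epochs=200):
    """Fit weights w and bias b by batch gradient descent.

    The decision boundary w.x + b = 0 plays the role of the hyperplane.
    """
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def signed_distances(X, w, b):
    """Signed distance of each feature vector from the hyperplane."""
    norm = math.sqrt(sum(wj * wj for wj in w))
    return [(sum(wj * xj for wj, xj in zip(w, xi)) + b) / norm for xi in X]


def histogram(values, lo=-5.0, hi=5.0, bins=10):
    """Normalized histogram (assumed form of the 'distribution')."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = max(0, min(bins - 1, int((v - lo) / width)))
        counts[idx] += 1
    return [c / len(values) for c in counts]


def distribution_difference(p, q):
    """Total variation distance between two histograms (an assumption)."""
    return 0.5 * sum(abs(a - c) for a, c in zip(p, q))


def should_retrain(X_first, X_second, w, b, threshold=0.2):
    """Return True when the difference reaches the (hypothetical) threshold."""
    first = histogram(signed_distances(X_first, w, b))
    second = histogram(signed_distances(X_second, w, b))
    return distribution_difference(first, second) >= threshold


# Toy data standing in for feature vectors extracted from source documents.
X_train = [[-2, -2], [-1, -2], [-2, -1], [2, 2], [1, 2], [2, 1]]
y_train = [0, 0, 0, 1, 1, 1]
X_later = [[2, 2], [1, 2], [2, 1], [3, 3], [2, 3], [3, 2]]  # drifted batch

w, b = train_logistic_regression(X_train, y_train)
print(should_retrain(X_train, X_train, w, b))
print(should_retrain(X_train, X_later, w, b))
```

Both distributions are measured against the same hyperplane, as the claim requires, so the model is not refit until the indicator fires. Other difference measures, such as a two-sample statistical test, could be substituted for the total variation distance without changing the overall flow.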