Title: Identifying Optimum Times at which to Retrain a Logistic Regression Model
Abstract: An approach is provided in which a knowledge manager trains a machine-learning model and generates a hyperplane based upon a first set of labeled feature vectors. The knowledge manager computes, relative to the hyperplane, a first distribution of a first set of feature vectors corresponding to a first group of source documents. Subsequently, the knowledge manager computes, relative to the hyperplane, a second distribution of a second set of feature vectors corresponding to a second group of source documents. The knowledge manager, in turn, generates an indicator to retrain the machine-learning model in response to determining that a distribution difference between the second distribution and the first distribution reaches a distribution difference threshold.
Publication No.: US2016283861(A1) Publication date: 2016.09.29
Application No.: US201514669061 Filing date: 2015.03.26
Applicant: International Business Machines Corporation Inventor: Gerard Scott N.
Classification: G06N99/00 Main classification: G06N99/00
Agency: Agent:
Main claim: 1. A method implemented by an information handling system that includes a memory and a processor, the method comprising: training a machine-learning model utilizing a first set of labeled feature vectors, wherein the training results in a first hyperplane; computing a first distribution of a first set of feature vectors relative to the first hyperplane, wherein the first set of feature vectors correspond to a first group of source documents; computing a second distribution of a second set of feature vectors relative to the first hyperplane, wherein the second set of feature vectors correspond to at least a second group of source documents; and generating an indicator in response to determining that a distribution difference between the second distribution and the first distribution reaches a distribution difference threshold, wherein the machine-learning model is retrained based on the generated indicator.
Address: Armonk, NY, US
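
The steps recited in the main claim can be illustrated with a minimal sketch. This record does not say how the distribution "relative to the hyperplane" is represented or how the distribution difference is measured, so the choices below are assumptions made for illustration only: the distribution is taken to be a normalized histogram of signed distances to the hyperplane, and the difference is taken to be the total variation distance. All function names, the bin edges, and the threshold value are hypothetical and do not come from the patent.

```python
import math


def train_logistic_regression(X, y, lr=0.1, epochs=500):
    """Fit a logistic regression by batch gradient descent.

    Returns (w, b); the decision hyperplane is the set of x with w.x + b = 0.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def signed_distances(X, w, b):
    """Signed Euclidean distance of each feature vector to the hyperplane."""
    norm = math.sqrt(sum(wj * wj for wj in w))
    return [(sum(wj * xj for wj, xj in zip(w, xi)) + b) / norm for xi in X]


def histogram(values, edges):
    """Normalized histogram; values outside the edges fall into the end bins."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        idx = 0
        while idx < len(counts) - 1 and v >= edges[idx + 1]:
            idx += 1
        counts[idx] += 1
    return [c / len(values) for c in counts]


def distribution_difference(p, q):
    """Total variation distance between two histograms (assumed metric)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def needs_retraining(baseline, current, threshold):
    """Retraining indicator: True once the difference reaches the threshold."""
    return distribution_difference(baseline, current) >= threshold
```

In use, the baseline histogram would be computed once from the first set of feature vectors after training, and `needs_retraining` would be evaluated on each later batch against the same hyperplane and the same bin edges. Other difference measures, such as a two-sample Kolmogorov-Smirnov statistic, could be substituted without changing the overall flow.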