Title of Invention |
RESIDUAL DATA IDENTIFICATION |
Abstract |
A technique for residual data identification can include receiving a plurality of data instances in a multi-class training data set that are labeled as belonging to recognized categories, receiving a plurality of data instances in a first unlabeled data set, and receiving a plurality of data instances in a second unlabeled data set. A technique for residual data identification can include labeling the plurality of data instances in the multi-class training data set as negative data instances. A technique for residual data identification can include labeling the plurality of data instances in the first unlabeled data set as positive data instances. A technique for residual data identification can include training a classifier with the labeled negative data instances and the labeled positive data instances. A technique for residual data identification can include applying the classifier to identify residual data instances in the second unlabeled data set. |
Publication Number |
US2016267168(A1) |
Publication Date |
2016.09.15 |
Application Number |
US201315033181 |
Filing Date |
2013.12.19 |
Applicant |
HEWLETT PACKARD ENTERPRISE DEVELOPMENT LP |
Inventors |
Forman George H.;Keshet Renato |
Classification Numbers |
G06F17/30;G06N99/00 |
Main Classification Number |
G06F17/30 |
Agency |
|
Agent |
|
Main Claim |
1. A non-transitory machine-readable medium storing instructions for residual data identification executable by a machine to cause the machine to:
receive a plurality of data instances in a multi-class training data set that are labeled as belonging to recognized categories;
receive a plurality of data instances in a first unlabeled data set;
label the plurality of data instances in the multi-class training data set as negative data instances;
label the plurality of data instances in the first unlabeled data set as positive data instances;
train a classifier with the labeled negative data instances and the labeled positive data instances;
receive a plurality of data instances in a second unlabeled data set; and
apply the classifier to identify residual data instances in the second unlabeled data set. |
Address |
Houston TX US |
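
The steps recited in the claim can be illustrated with a minimal Python sketch. This is an assumption-laden illustration, not the applicant's implementation: the record does not say which classifier is used, so a small logistic regression trained by gradient descent stands in for it. The function names (`train_logistic`, `identify_residual`), the 0.5 score threshold, and the toy two-feature data are all hypothetical.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic(X, y, epochs=2000, lr=0.1):
    """Fit weights and bias by batch gradient descent (stand-in classifier)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def identify_residual(training_set, unlabeled_1, unlabeled_2, threshold=0.5):
    # Relabel: every instance with a recognized category becomes negative (0),
    # every instance of the first unlabeled set becomes positive (1).
    negatives = [features for features, _category in training_set]
    positives = list(unlabeled_1)
    X = negatives + positives
    y = [0] * len(negatives) + [1] * len(positives)
    w, b = train_logistic(X, y)
    # Apply the classifier to the second unlabeled set; instances scored above
    # the (assumed) threshold are reported as residual.
    def score(x):
        return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)
    return [x for x in unlabeled_2 if score(x) > threshold]

# Toy data (hypothetical): categories "A" and "B" are recognized; points near
# x = 5 belong to no recognized category.
training_set = [((0.0, 0.0), "A"), ((0.2, 0.1), "A"), ((0.1, 0.3), "A"),
                ((0.0, 4.0), "B"), ((0.3, 3.9), "B"), ((0.1, 4.2), "B")]
unlabeled_1 = [(0.1, 0.1), (0.2, 4.0), (5.0, 2.0), (5.2, 2.1), (4.9, 1.8)]
unlabeled_2 = [(0.1, 0.2), (0.2, 4.1), (5.1, 2.0), (4.8, 2.2)]

residual = identify_residual(training_set, unlabeled_1, unlabeled_2)
print(residual)
```

In this sketch, instances resembling the recognized categories occur under both labels, so the classifier scores them low, while instances that occur only in the unlabeled data score high and are flagged as residual.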