Title of invention RESIDUAL DATA IDENTIFICATION
Abstract A technique for residual data identification can include receiving a plurality of data instances in a multi-class training data set that are labeled as belonging to recognized categories, receiving a plurality of data instances in a first unlabeled data set, and receiving a plurality of data instances in a second unlabeled data set. A technique for residual data identification can include labeling the plurality of data instances in the multi-class training data set as negative data instances. A technique for residual data identification can include labeling the plurality of data instances in the first unlabeled data set as positive data instances. A technique for residual data identification can include training a classifier with the labeled negative data instances and the labeled positive data instances. A technique for residual data identification can include applying the classifier to identify residual data instances in the second unlabeled data set.
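Application publication number US2016267168(A1) Application publication date 2016.09.15
Application number US201315033181 Filing date 2013.12.19
Applicant HEWLETT PACKARD ENTERPRISE DEVELOPMENT LP Inventor Forman George H.;Keshet Renato
Classification number G06F17/30;G06N99/00 Main classification number G06F17/30
Agency Agent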
Principal claim 1. A non-transitory machine-readable medium storing instructions for residual data identification executable by a machine to cause the machine to: receive a plurality of data instances in a multi-class training data set that are labeled as belonging to recognized categories; receive a plurality of data instances in a first unlabeled data set; label the plurality of data instances in the multi-class training data set as negative data instances; label the plurality of data instances in the first unlabeled data set as positive data instances; train a classifier with the labeled negative data instances and the labeled positive data instances; receive a plurality of data instances in a second unlabeled data set; and apply the classifier to identify residual data instances in the second unlabeled data set.
Address Houston TX US
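Illustrative sketch (not part of the patent record): the steps recited in claim 1 can be outlined in Python as follows. The record does not say which classifier is used, so this sketch assumes a simple k-nearest-neighbour vote. The function names, the 2-D feature points, the category names, `k`, and `threshold` are all hypothetical choices made only for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Labeled = Tuple[Point, int]  # (features, 0 = negative, 1 = positive)


def build_training_set(known: Sequence[Tuple[Point, str]],
                       unlabeled_first: Sequence[Point]) -> List[Labeled]:
    """Relabel the data as in the claim: every instance from the multi-class
    training set becomes negative (its category is discarded), and every
    instance from the first unlabeled set becomes positive."""
    negatives = [(x, 0) for x, _category in known]
    positives = [(x, 1) for x in unlabeled_first]
    return negatives + positives


def positive_score(model: Sequence[Labeled], x: Point, k: int = 3) -> float:
    """Assumed classifier: fraction of the k nearest training instances
    that carry the positive label."""
    nearest = sorted(model, key=lambda item: math.dist(item[0], x))[:k]
    return sum(label for _, label in nearest) / k


def identify_residual(model: Sequence[Labeled],
                      unlabeled_second: Sequence[Point],
                      k: int = 3, threshold: float = 0.5) -> List[Point]:
    """Apply the classifier to the second unlabeled set; instances scored
    above the (hypothetical) threshold are reported as residual."""
    return [x for x in unlabeled_second
            if positive_score(model, x, k) > threshold]


# Hypothetical toy data: two recognized categories near (0, 0) and (5, 5).
known = [((0.0, 0.0), "billing"), ((0.2, 0.1), "billing"),
         ((-0.1, 0.3), "billing"), ((0.1, -0.2), "billing"),
         ((5.0, 5.0), "shipping"), ((5.2, 4.9), "shipping"),
         ((4.8, 5.1), "shipping"), ((5.1, 5.2), "shipping")]
# First unlabeled set: a mix of known-looking points and points near (10, 0).
unlabeled_first = [(0.1, 0.1), (5.0, 5.1),
                   (10.0, 0.0), (10.2, 0.1), (9.9, -0.2)]
# Second unlabeled set: the data in which residual instances are sought.
unlabeled_second = [(0.0, 0.2), (4.9, 5.0), (10.1, 0.0), (9.8, 0.1)]

model = build_training_set(known, unlabeled_first)
residual = identify_residual(model, unlabeled_second)
print(residual)
```

In this toy data, points from the first unlabeled set that fall inside a recognized category are outvoted by the surrounding negative instances, so only the two points near (10, 0) score above the threshold and are returned as residual. A different classifier or threshold could be substituted without changing the relabeling steps.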