Title of invention: FEATURE COMPLETION IN COMPUTER-HUMAN INTERACTIVE LEARNING
Abstract: A collection of data that is extremely large can be difficult to search and/or analyze. Relevance may be dramatically improved by automatically classifying queries and web pages in useful categories, and using these classification scores as relevance features. A thorough approach may require building a large number of classifiers, corresponding to the various types of information, activities, and products. Creation of classifiers and schematizers is provided on large data sets. Exercising the classifiers and schematizers on hundreds of millions of items may expose value that is inherent to the data by adding usable meta-data. Some aspects include active labeling exploration, automatic regularization and cold start, scaling with the number of items and the number of classifiers, active featuring, and segmentation and schematization.
Publication number: US2016239761(A1)    Publication date: 2016.08.18
Application number: US201615135266    Filing date: 2016.04.21
Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC    Inventors: SIMARD PATRICE Y.;CHICKERING DAVID MAX;GRANGIER DAVID G.;CHARLES DENIS X.;BOTTOU LEON;SUAREZ CARLOS GARCIA JURADO
Classification: G06N99/00;G06N7/00;G06F17/27    Main classification: G06N99/00
Agency:    Agent:
Independent claim: 1. A system for feature completion for machine learning, comprising: one or more processing devices that:
store a first set of data items, wherein each data item includes a text stream of words;
provide a dictionary, wherein the dictionary includes a list of words that define a concept usable as an input feature for training a machine-learning model to score data items with a probability of being a positive example or a negative example of a particular class of data item;
provide a feature that is trained to calculate a first probability of a presence, within a stream of one or more words, of a disjunction of one or more n-grams that correspond semantically to the concept defined by the words in the dictionary;
utilize the feature to determine the first probability of the presence, within a stream of one or more words, of a disjunction of one or more n-grams that correspond semantically to the concept defined by the words in the dictionary at a given word position in the data item;
provide a machine-learning model that is trainable to calculate a second probability of the presence, within the stream of one or more words at the given word position, of the disjunction of the one or more n-grams that correspond semantically to the concept defined by the words in the dictionary, based on one or more words in the data item not utilized by the feature to determine the first probability;
utilize the machine-learning model to determine the second probability of the presence, within the stream of one or more words at the given word position, of a disjunction of one or more n-grams that correspond semantically to the concept defined by the words in the dictionary, based on the one or more words in the data item not utilized by the feature to determine the first probability;
determine an actual presence or absence, at the given word position, of the disjunction of the one or more n-grams that correspond semantically to the concept defined by the words in the dictionary;
and modify the machine-learning model to adjust the second probability in a positive or negative direction based on the determined actual presence or absence of the disjunction of the one or more n-grams that correspond semantically to the concept defined by the words in the dictionary.
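The steps recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation, and several parts are assumptions made only for illustration: the dictionary holds single words rather than n-grams, the "feature" is a hard lookup that returns 1.0 or 0.0 instead of a trained probability, and the context model is a hypothetical bag-of-words logistic regression over a fixed window. The names `dictionary_feature`, `ContextModel`, and `train` do not appear in the source.

```python
import math


def dictionary_feature(words, pos, dictionary):
    """First probability: 1.0 if the word at `pos` is in the dictionary, else 0.0.

    Assumption: a hard single-word lookup stands in for the claim's trained
    feature over a disjunction of n-grams.
    """
    return 1.0 if words[pos].lower() in dictionary else 0.0


class ContextModel:
    """Hypothetical logistic model that sees only the words around `pos`."""

    def __init__(self, window=2, learning_rate=0.5):
        self.window = window
        self.learning_rate = learning_rate
        self.weights = {}  # one weight per context word
        self.bias = 0.0

    def context(self, words, pos):
        """Lowercased words within `window` of `pos`, excluding `pos` itself."""
        lo = max(0, pos - self.window)
        hi = min(len(words), pos + self.window + 1)
        return [words[i].lower() for i in range(lo, hi) if i != pos]

    def predict(self, words, pos):
        """Second probability: estimated from context words only."""
        z = self.bias + sum(self.weights.get(w, 0.0) for w in self.context(words, pos))
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, words, pos, present):
        """One gradient step on log loss toward the observed presence or absence."""
        error = self.predict(words, pos) - (1.0 if present else 0.0)
        for w in self.context(words, pos):
            self.weights[w] = self.weights.get(w, 0.0) - self.learning_rate * error
        self.bias -= self.learning_rate * error


def train(model, documents, dictionary, epochs=20):
    """Use the dictionary's actual presence at each position as the training signal."""
    for _ in range(epochs):
        for words in documents:
            for pos in range(len(words)):
                present = dictionary_feature(words, pos, dictionary) == 1.0
                model.update(words, pos, present)


if __name__ == "__main__":
    cities = {"paris", "london"}  # toy dictionary, invented for this example
    docs = [s.split() for s in (
        "we flew to paris last week",
        "she flew to london last month",
        "he gave it to bob last night",
    )]
    model = ContextModel()
    train(model, docs, cities)
    unseen = "they flew to tokyo last year".split()
    for pos, word in enumerate(unseen):
        first = dictionary_feature(unseen, pos, cities)
        second = model.predict(unseen, pos)
        print(f"{word:>6}  first={first:.1f}  second={second:.2f}")
```

Because the context model never looks at the word at the scored position, a position where the second probability is high while the first is 0.0 may point at a word that fits the concept but is missing from the dictionary. That reading follows the title's "feature completion" and is an interpretation, not something the claim text states in these terms.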
Address: Redmond WA US