Abstract |
The disclosed technology uses a Naïve Bayes Classifier with Active-Feature Ordering to identify contributors to a contact database who are likely to be able to update an arbitrary contact. It further relates to identifying the n most likely records, where each record has a number of features and each feature has a specific finite number of different possible values. The disclosed technology also describes using a Naïve Bayes Classifier with Active-Feature Ordering for diagnostic screening: a patient's symptoms are evaluated against a compendium of diseases to choose the diseases with the greatest posterior likelihood given the patient's vector of observed symptoms. It additionally describes using a Naïve Bayes Classifier with Active-Feature Ordering for crowdsourcing tasks, with a sample data set that includes thousands of workers, to identify an experienced worker to complete a featured task. |
Principal claim |
1. A method of classifying objects that include features, including:
initializing a top-n classes classifier using a configuration data set that includes:
sets of unique feature-values,
counts or relative likelihoods of the unique feature-values in the training examples and of the classes in the training examples, and
ordered lists of classes that include the unique feature-values,
the initializing further including loading or calculating counts by feature of the unique feature-values and a count of training set elements; and
classifying a target object into up to a predetermined number of classes, including:
using selected features common to the target object and the configuration data set, beginning with a first feature that has more unique feature-values than other features;
for a first feature-value of the first feature of the target object, evaluating at least the relative likelihood of the first feature-value belonging to at least the predetermined number of classes selected from the ordered list of classes for the first feature-value;
for additional feature-values of additional features of the target object, generally processing the additional features in order of decreasing number of unique feature-values per feature, and updating joint relative likelihoods of the target object belonging to classes selected using at least the relative likelihoods of the first and additional features; and
outputting at least the predetermined number of classes for the target object based on the updated relative likelihoods. |
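The steps recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names (`build_config`, `classify_top_n`), the `candidate_limit` parameter, the add-one (Laplace) smoothing, the use of log-likelihoods, and the toy contact data are all assumptions introduced for illustration. In particular, the sketch draws candidate classes only from the ordered list for the first feature-value, which is one simple reading of the claim language.

```python
import math
from collections import Counter, defaultdict


def build_config(examples):
    """Build a configuration data set from (features_dict, class_label) pairs."""
    class_counts = Counter()
    value_counts = defaultdict(Counter)  # (feature, value) -> per-class counts
    unique_values = defaultdict(set)     # feature -> set of observed values
    for features, label in examples:
        class_counts[label] += 1
        for feat, val in features.items():
            value_counts[(feat, val)][label] += 1
            unique_values[feat].add(val)
    # Ordered list of classes for each feature-value, most frequent first.
    ordered_classes = {
        key: [cls for cls, _ in counts.most_common()]
        for key, counts in value_counts.items()
    }
    return {
        "class_counts": class_counts,
        "value_counts": dict(value_counts),
        "unique_values": dict(unique_values),
        "ordered_classes": ordered_classes,
        "total": len(examples),
    }


def classify_top_n(config, target, n, candidate_limit=None):
    """Return up to n classes for target, most likely first."""
    # Features common to the target and the configuration,
    # ordered by decreasing number of unique feature-values.
    feats = sorted(
        (f for f in target if f in config["unique_values"]),
        key=lambda f: len(config["unique_values"][f]),
        reverse=True,
    )
    if not feats:
        return []
    first = feats[0]
    # Assumption: candidates are the first (at least n) classes on the
    # ordered list for the first feature-value.
    limit = max(n, candidate_limit or n)
    candidates = config["ordered_classes"].get((first, target[first]), [])[:limit]
    # Start from the log prior of each candidate class.
    scores = {
        c: math.log(config["class_counts"][c] / config["total"])
        for c in candidates
    }
    # Update the joint log-likelihood one feature at a time.
    for f in feats:
        k = len(config["unique_values"][f])
        per_class = config["value_counts"].get((f, target[f]), {})
        for c in scores:
            # Add-one smoothing (an assumption) avoids log(0).
            smoothed = (per_class.get(c, 0) + 1) / (config["class_counts"][c] + k)
            scores[c] += math.log(smoothed)
    return sorted(scores, key=scores.get, reverse=True)[:n]


# Hypothetical toy data: which contributor updated which contact.
examples = [
    ({"domain": "acme.com", "state": "CA"}, "alice"),
    ({"domain": "acme.com", "state": "CA"}, "alice"),
    ({"domain": "acme.com", "state": "NY"}, "bob"),
    ({"domain": "initech.com", "state": "NY"}, "bob"),
    ({"domain": "globex.com", "state": "CA"}, "carol"),
]
config = build_config(examples)
print(classify_top_n(config, {"domain": "acme.com", "state": "CA"}, n=2))
```

In this sketch `domain` is processed first because it has three unique values against two for `state`. A target whose first feature-value never appears in the training examples yields an empty result; a fuller implementation would need a fallback for that case.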