Abstract |
Methods and apparatus are disclosed for generating a classifier for classifying text. Minimum classification error (MCE) techniques are employed to train generalized linear classifiers for text classification. In particular, minimum classification error training is performed on an initial generalized linear classifier to generate a trained initial classifier. A boosting algorithm, such as the AdaBoost algorithm, is then applied to the trained initial classifier to generate m alternative classifiers, which are then trained using minimum classification error training to generate m trained alternative classifiers. A final classifier is selected from the trained initial classifier and m trained alternative classifiers based on a classification error rate.
|
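The training flow described in the abstract can be illustrated with a minimal sketch. The sketch is not the disclosed implementation, and it rests on several assumptions: the generalized linear classifier is a plain weight matrix over bag-of-words counts, the MCE loss is a sigmoid of the score gap between the best rival class and the true class, the boosting step uses AdaBoost.M1-style sample reweighting, and the classification error rate is measured on the training data. All function names, parameters, and the toy corpus are hypothetical.

```python
import numpy as np

def bag_of_words(docs, vocab):
    """Term-count vectors with a trailing constant bias feature."""
    index = {t: j for j, t in enumerate(vocab)}
    X = np.zeros((len(docs), len(vocab) + 1))
    X[:, -1] = 1.0
    for i, doc in enumerate(docs):
        for tok in doc.lower().split():
            if tok in index:
                X[i, index[tok]] += 1.0
    return X

def predict(W, X):
    """Pick the class whose linear score is highest."""
    return (X @ W.T).argmax(axis=1)

def error_rate(W, X, y):
    return float(np.mean(predict(W, X) != y))

def mce_train(X, y, n_classes, w, epochs=200, lr=0.5, gamma=1.0):
    """Gradient descent on a weighted, sigmoid-smoothed classification error.

    Assumption: the misclassification measure is (best rival score - true
    score), a simplification of the usual soft-max over rival classes.
    """
    n, d = X.shape
    W = np.zeros((n_classes, d))
    idx = np.arange(n)
    for _ in range(epochs):
        scores = X @ W.T
        masked = scores.copy()
        masked[idx, y] = -np.inf              # exclude the true class
        rival = masked.argmax(axis=1)         # strongest competing class
        d_mis = scores[idx, rival] - scores[idx, y]   # > 0 means an error
        loss = 1.0 / (1.0 + np.exp(-np.clip(gamma * d_mis, -50, 50)))
        g = w * gamma * loss * (1.0 - loss)   # d(loss)/d(d_mis), per sample
        grad = np.zeros_like(W)
        np.add.at(grad, rival, g[:, None] * X)
        np.add.at(grad, y, -g[:, None] * X)
        W -= lr * grad
    return W

def generate_classifier(X, y, n_classes, m=3):
    """MCE-train an initial classifier, derive m boosted alternatives,
    MCE-train each, and keep the candidate with the lowest error rate."""
    y = np.asarray(y)
    w = np.full(len(y), 1.0 / len(y))
    W = mce_train(X, y, n_classes, w)
    candidates = [W]
    for _ in range(m):
        wrong = predict(W, X) != y
        eps = np.clip(w[wrong].sum(), 1e-10, 1 - 1e-10)  # weighted error
        # AdaBoost.M1-style update (assumed): shrink weights of correctly
        # classified samples so the next classifier focuses on the errors.
        w = np.where(wrong, w, w * eps / (1.0 - eps))
        w = w / w.sum()
        W = mce_train(X, y, n_classes, w)
        candidates.append(W)
    errors = [error_rate(C, X, y) for C in candidates]
    best = int(np.argmin(errors))
    return candidates[best], errors

# Hypothetical toy corpus: class 0 = billing, class 1 = account access.
docs = ["refund my order", "order arrived broken refund", "reset my password",
        "password login failed", "refund please", "login reset"]
labels = np.array([0, 0, 1, 1, 0, 1])
vocab = ["refund", "order", "password", "login", "reset", "broken"]
X = bag_of_words(docs, vocab)
final_W, errors = generate_classifier(X, labels, n_classes=2, m=3)
print("candidate error rates:", errors)
```

In practice the error rate used for the final selection would more plausibly be computed on held-out data; the training set is used here only to keep the sketch short.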