Abstract |
Embodiments are directed towards filtering, from a user generated content (UGC) search result, those documents determined to have insufficient subject matter specificity, as defined by the training of a classification filter. The training comprises selecting one set of UGC that is definable as having sufficient subject matter specificity (a good set) and another set of UGC that is definable as having insufficient subject matter specificity (a bad set). The trained UGC classifier may examine search documents and categorize each document as having sufficient subject matter specificity when its determined value is above a defined threshold, and as having insufficient subject matter specificity otherwise. Those documents categorized as having insufficient subject matter specificity may be filtered out of the submitted UGC search results. The documents remaining within the UGC search results may then be provided to a searcher for display at a client device.
|
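The train, score, threshold, and filter flow described in the abstract can be illustrated with a minimal sketch. The abstract does not say which classifier or features are used, so everything here is an assumption made for illustration: the whitespace tokenizer, the smoothed log-odds token weights, the averaged document score, and the names `train`, `score`, and `filter_results` are all hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokens stand in for real features.
    return text.lower().split()


def train(good_docs, bad_docs):
    """Learn a log-odds weight per token from the good set and the bad set."""
    good = Counter(t for d in good_docs for t in tokenize(d))
    bad = Counter(t for d in bad_docs for t in tokenize(d))
    vocab = set(good) | set(bad)
    # Add-one smoothing so tokens seen in only one set keep a finite weight.
    good_total = sum(good.values()) + len(vocab)
    bad_total = sum(bad.values()) + len(vocab)
    return {
        t: math.log((good[t] + 1) / good_total)
        - math.log((bad[t] + 1) / bad_total)
        for t in vocab
    }


def score(weights, doc):
    """Average token weight; tokens not seen in training contribute 0."""
    tokens = tokenize(doc)
    if not tokens:
        return float("-inf")  # an empty document never passes the threshold
    return sum(weights.get(t, 0.0) for t in tokens) / len(tokens)


def filter_results(weights, results, threshold=0.0):
    """Keep only the documents whose score is above the threshold."""
    return [d for d in results if score(weights, d) > threshold]
```

In this sketch the threshold of 0.0 is an arbitrary default. In practice it would be tuned on held-out examples to trade off how many specific documents are kept against how many unspecific ones slip through.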