Title of invention: SEARCH ENGINE CLASSIFICATION
Abstract: Techniques for enabling a search engine to automatically classify the content type of Web documents. In an exemplary embodiment, Web documents may be classified as adult or non-adult, based on whether a document contains adult content. In an aspect, Web documents are mined offline to determine the presence of “adult hubs” to which adult documents are connected. The presence of such adult hubs is a strong indicator that linking Web documents may themselves contain adult content. Computational techniques for quantifying the connection between a candidate document and adult hubs are disclosed. The techniques may be utilized in an Internet search engine platform designed to accept user search queries and deliver highly relevant results.
Publication number: US2016239572(A1)  Publication date: 2016.08.18
Application number: US201514622870  Filing date: 2015.02.15
Applicant: Microsoft Technology Licensing, LLC  Inventors: Gutierrez Munoz Alejandro;Whisler Jon;Levi Adam;Golebiewski Michael;Rondel Igor;Moradi Shahab
Classification: G06F17/30  Main classification: G06F17/30
Agency:  Agent:
Main claim: 1. A search engine apparatus for responding to a search query, the search engine apparatus comprising: an adult hub detection block configured to calculate a hub score for an adult hub, wherein the adult hub is connected to a plurality of documents; a classifier block comprising a candidate document feature block configured to calculate a document score for the candidate document, the classifier block configured to classify the candidate document as adult based on inputs comprising the document score of the candidate document and the hub score; and a filter block configured to remove any candidate document classified as adult from responses to the search query.
Address: Redmond WA US
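
Illustrative sketch (not part of the patent record): the three blocks recited in claim 1, namely hub detection, classification from a document score plus a hub score, and result filtering, could be wired together roughly as below. The record does not disclose how either score is computed or how they are combined, so everything concrete here is an assumption made for illustration: the `Document` type, the "share of connected documents that look adult" hub score, the weighted-sum combination, and the 0.5 thresholds are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    url: str
    # Hypothetical content-based score in [0, 1]; the record does not say how it is derived.
    document_score: float
    # URLs of the hubs this document is connected to (assumed representation).
    hub_urls: list[str] = field(default_factory=list)


def hub_score(connected_docs: list[Document], adult_threshold: float = 0.5) -> float:
    """Assumed hub score: fraction of connected documents whose own score looks adult."""
    if not connected_docs:
        return 0.0
    adult = sum(1 for d in connected_docs if d.document_score >= adult_threshold)
    return adult / len(connected_docs)


def classify_as_adult(
    doc: Document,
    hub_scores: dict[str, float],
    weight: float = 0.5,
    threshold: float = 0.5,
) -> bool:
    """Assumed classifier: weighted sum of the document score and the strongest hub score."""
    best_hub = max((hub_scores.get(h, 0.0) for h in doc.hub_urls), default=0.0)
    combined = (1.0 - weight) * doc.document_score + weight * best_hub
    return combined >= threshold


def filter_results(candidates: list[Document], hub_scores: dict[str, float]) -> list[Document]:
    """Filter block: drop every candidate classified as adult from the response."""
    return [d for d in candidates if not classify_as_adult(d, hub_scores)]


if __name__ == "__main__":
    # Offline step: score one hypothetical hub from the documents connected to it.
    hub_members = [
        Document("http://example.com/a", 0.9),
        Document("http://example.com/b", 0.8),
        Document("http://example.com/c", 0.2),
    ]
    hub_scores = {"http://example.com/hub": hub_score(hub_members)}

    # Query time: two candidates with the same content score, only one linked to the hub.
    candidates = [
        Document("http://example.com/x", 0.4, ["http://example.com/hub"]),
        Document("http://example.com/y", 0.4),
    ]
    for doc in filter_results(candidates, hub_scores):
        print(doc.url)
```

In this sketch the two candidates differ only in their hub connection, which shows the role the claim assigns to the hub score as a classifier input alongside the document score. A production classifier would more likely be a trained model than a fixed weighted sum.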