Title of invention: Web Knowledge Extraction for Search Task Simplification
Abstract: Techniques are described for generating structured information from semi-structured web pages and for retrieving that structured knowledge in response to a user query that indicates a query intent. The structured information is extracted offline from semi-structured web pages by an automatic wrapper ("auto wrapper") solution that is noise tolerant and scalable. The extracted information is stored in a knowledge base and returned in response to a user search query that indicates a query intent. Extraction may also include clustering pages based on their measured similarities. The clusters may be determined based on similar elements in the tag path text data of the pages, and a minimum size threshold may be applied to the clusters.
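The clustering step mentioned in the abstract can be illustrated with a minimal sketch. It is not the patented method: the abstract does not specify the similarity measure or clustering procedure, so the Jaccard similarity over tag paths, the greedy single-pass grouping, and the names `tag_paths`, `jaccard`, `cluster_pages`, `similarity_threshold`, and `min_cluster_size` are all assumptions made for illustration.

```python
from html.parser import HTMLParser

# Assumption: a small set of void elements that never receive a closing tag.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class TagPathCollector(HTMLParser):
    """Collects the tag path (e.g. 'html/body/div/span') of each text node."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.paths = set()

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; ignore stray closing tags.
        if tag in self.stack:
            while self.stack and self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        if data.strip():
            self.paths.add("/".join(self.stack))


def tag_paths(html):
    """Return the set of tag paths that lead to non-empty text in a page."""
    collector = TagPathCollector()
    collector.feed(html)
    collector.close()
    return frozenset(collector.paths)


def jaccard(a, b):
    """Jaccard similarity of two sets (assumed measure, not from the source)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_pages(pages, similarity_threshold=0.6, min_cluster_size=2):
    """Greedily group pages (url -> html) whose tag paths are similar.

    Clusters smaller than min_cluster_size are dropped, mirroring the
    minimum size threshold described in the abstract.
    """
    clusters = []  # list of (representative tag paths, member urls)
    for url, html in pages.items():
        paths = tag_paths(html)
        for representative, members in clusters:
            if jaccard(paths, representative) >= similarity_threshold:
                members.append(url)
                break
        else:
            clusters.append((paths, [url]))
    return [m for _, m in clusters if len(m) >= min_cluster_size]


pages = {
    "a": "<html><body><div><h1>Camera A</h1><span>$100</span></div></body></html>",
    "b": "<html><body><div><h1>Camera B</h1><span>$250</span></div></body></html>",
    "c": "<html><body><p>About us</p></body></html>",
}
clusters = cluster_pages(pages)
```

In this hypothetical input, pages "a" and "b" share the same template and end up in one cluster, while page "c" forms a singleton that the minimum size threshold discards. A wrapper could then be learned per surviving cluster; that step is outside this sketch.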
Publication number: US2013138655(A1) Publication date: 2013.05.30
Application number: US201113307836 Filing date: 2011.11.30
Applicant: YAN JUN;JI LEI;LIU NING;CHEN ZHENG;MICROSOFT CORPORATION Inventor: YAN JUN;JI LEI;LIU NING;CHEN ZHENG
Classification: G06F17/30 Main classification: G06F17/30
Agency: Agent:
Main claim:
Address: