Title: METHODS AND APPARATUSES FOR SEARCHING CONTENT
Abstract: Embodiments of methods and apparatuses for searching content, including structured search for search expressions associated with documents, are described herein. Embodiments may use tree structures (or more generally, graph structures), layout structures, and/or other information to capture within search results relevant content, including sub-document constituents, and/or to improve the accuracy of rankings within search results, and/or to classify documents within hierarchies of documents, and/or to cluster documents. Embodiments may use distance and/or scoring functions to generate scores for the structures to indicate relevance, including usage of local geometry, and linear iteration over portions of the content at a level to capture potential of a portion to influence other portions of the level, and influence received by a portion from the other portions of the level, and influence received by a portion from lower levels. Other embodiments may be described and claimed.
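The "linear iteration over portions of the content at a level" mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the patent: the function name `received_influence`, the `decay` parameter, and the exponential fall-off with sibling distance are all assumptions chosen for illustration. Each portion at a level has a potential to influence its siblings, and two linear passes accumulate the influence each portion receives from the others.

```python
from __future__ import annotations


def received_influence(potentials: list[float], decay: float = 0.5) -> list[float]:
    """Influence each portion receives from the other portions at its level.

    Assumption (hypothetical model): portion j contributes
    potentials[j] * decay ** |i - j| to portion i. Two sweeps compute
    all sums in O(n) time instead of comparing every pair.
    """
    n = len(potentials)
    received = [0.0] * n

    # Left-to-right sweep: influence arriving from portions to the left.
    carry = 0.0
    for i in range(n):
        received[i] += carry
        carry = (carry + potentials[i]) * decay

    # Right-to-left sweep: influence arriving from portions to the right.
    carry = 0.0
    for i in reversed(range(n)):
        received[i] += carry
        carry = (carry + potentials[i]) * decay

    return received


if __name__ == "__main__":
    # Only the first portion has any potential to influence the others.
    print(received_influence([1.0, 0.0, 0.0]))
```

In this sketch a portion never receives influence from itself; influence from lower levels, which the abstract also mentions, would have to be folded into the potentials before the sweep.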
Publication number: US2016012131(A1) Publication date: 2016.01.14
Application number: US201514695478 Filing date: 2015.04.24
Applicant: Epstein Samuel S. Inventor: Epstein Samuel S.
Classification: G06F17/30 Main classification: G06F17/30
Agency: Agent:
Main claim: 1. A computer implemented method comprising: receiving, by a computing device, a hierarchy of documents, a set having one or more documents, and a search expression associated with the set of documents, with a member of the set of documents, with a constituent of a member of the set of documents, or with a set of constituents of members of the set of documents; generating, by the computing device, one or more scores for one or more structures of the hierarchy of documents indicative of relative relevance of the documents having the structure to the search expression, wherein the generating of a score for a structure indicative of relative relevance of first one or more of the documents to the search expression is based at least in part on a distance function and a scoring function, wherein the structure has sub-structures structurally describing at least a portion of the hierarchy of documents, and having nodes and/or text strings, wherein the sub-structures are hierarchically organized with the one or more portions of the hierarchy of documents in a sub-structure at a level respectively assigned one or more positions according to a geometry established for that level, wherein the distance function measures distances between sub-structures within the structure, and the scoring function is positionally sensitive, yielding different scores for the structure for different occurrence positions of a given sub-structure; and outputting the one or more scores for one or more structures of the hierarchy of documents indicative of relative relevance of the structures to the document or document constituent or set of documents or set of document constituents associated with the search expression.
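The interplay of a distance function and a positionally sensitive scoring function in the claim can be made concrete with a small sketch. It is an illustration under stated assumptions, not the claimed method: the `Node` class, the one-dimensional geometry (sibling index as position), and the pairwise proximity bonus are all hypothetical choices. The point it demonstrates is that the same sub-structures, placed at different positions, give the enclosing structure a different score.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """A sub-structure: a text string plus child sub-structures (hypothetical)."""
    text: str = ""
    children: list[Node] = field(default_factory=list)


def distance(i: int, j: int) -> float:
    """Distance between two sibling positions, assuming a 1-D geometry per level."""
    return float(abs(i - j))


def score(node: Node, terms: set[str]) -> float:
    """Positionally sensitive score of a structure against search terms.

    Assumed scoring rule: a node's own term matches, plus its children's
    scores, plus a bonus for each pair of scoring children that shrinks
    as the distance between their positions grows.
    """
    own = float(sum(1 for word in node.text.lower().split() if word in terms))
    child_scores = [score(child, terms) for child in node.children]
    total = own + sum(child_scores)
    for i in range(len(child_scores)):
        for j in range(i + 1, len(child_scores)):
            total += child_scores[i] * child_scores[j] / (1.0 + distance(i, j))
    return total


if __name__ == "__main__":
    terms = {"search", "ranking"}
    adjacent = Node(children=[Node("tree search"), Node("graph ranking"), Node("unrelated")])
    separated = Node(children=[Node("tree search"), Node("unrelated"), Node("graph ranking")])
    # Same sub-structures, different occurrence positions, different scores.
    print(score(adjacent, terms), score(separated, terms))
```

The pairwise loop is quadratic in the number of siblings and is kept only for readability; a linear sweep over the level would be needed for large levels.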
Address: Sammamish, WA, US