Title Indexing and retrieval of structured documents
Abstract Facilitating the searching of structured documents by identifying multiple element paths corresponding to multiple elements included in multiple structured documents, and for each of the element paths providing, for inclusion in a first searchable data structure, the element path exclusive of a value of the element corresponding to the element path and exclusive of an identifier of the structured document including the element corresponding to the element path, and providing, for inclusion in a second searchable data structure, the element path in association with a value of the element corresponding to the element path and in association with an identifier of the structured document including the element corresponding to the element path.
Publication number US9104730(B2) Publication date 2015.08.11
Application number US201213493836 Filing date 2012.06.11
Applicant International Business Machines Corporation Inventors Paikowsky Oren;Stark Shimon;Tzaban Yariv
Classification G06F17/30 Main classification G06F17/30
Agency Edell, Shapiro & Finnan, LLC Agent Polimeni Joe;Edell, Shapiro & Finnan, LLC
Main claim 1. A system for facilitating the searching of structured documents, the system comprising: a document processor configured to identify a plurality of element paths corresponding to a plurality of elements included in a plurality of structured documents; an element path search preprocessor configured to provide each of the element paths for inclusion in a first searchable data structure, wherein for each of the element paths the element path search preprocessor is configured to provide the element path exclusive of a value of the element corresponding to the element path and exclusive of an identifier of the structured document including the element corresponding to the element path; a document search preprocessor configured to provide each of the element paths for inclusion in a second searchable data structure, wherein for each of the element paths the document search preprocessor is configured to provide the element path in association with a value of the element corresponding to the element path and in association with an identifier of the structured document including the element corresponding to the element path; an element path search engine configured to search the first searchable data structure for a query element path in response to a first query, thereby producing an element path result; and a document search engine configured to determine a second query by combining the element path result with one or more elements of the first query and search the second searchable data structure for the element path result based on the second query, thereby identifying any of the structured documents associated with the element path result.
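The two-structure arrangement recited in the claim can be illustrated with a minimal Python sketch. This is not the patented implementation; it only mirrors the claimed data flow under several assumptions: nested dictionaries stand in for structured documents, a query element path matches by suffix, and the "one or more elements of the first query" combined in the second step is a single element value. All names (`element_paths`, `build_indexes`, `search`) are hypothetical.

```python
from collections import defaultdict


def element_paths(doc, prefix=""):
    """Yield (path, value) for every leaf element of a nested dict (assumed document model)."""
    for key, value in doc.items():
        path = f"{prefix}/{key}"
        if isinstance(value, dict):
            yield from element_paths(value, path)
        else:
            yield path, value


def build_indexes(docs):
    """Build both searchable structures from a mapping of document id -> document."""
    path_index = set()            # first structure: element paths only, no values, no doc ids
    doc_index = defaultdict(set)  # second structure: (path, value) -> ids of containing documents
    for doc_id, doc in docs.items():
        for path, value in element_paths(doc):
            path_index.add(path)
            doc_index[(path, value)].add(doc_id)
    return path_index, doc_index


def search(path_index, doc_index, query_path, value):
    """Two-step search: resolve the query path first, then look up documents."""
    # Step 1: search the first structure for the query element path
    # (suffix matching is an assumption made for this sketch).
    path_result = sorted(p for p in path_index if p.endswith(query_path))
    # Step 2: form the second query by combining each resolved path with the
    # value taken from the first query, then search the second structure.
    doc_ids = set()
    for path in path_result:
        doc_ids |= doc_index.get((path, value), set())
    return path_result, sorted(doc_ids)


# Hypothetical sample documents, for illustration only.
docs = {
    "doc1": {"order": {"customer": {"name": "Ada"}, "total": "40"}},
    "doc2": {"invoice": {"customer": {"name": "Ada"}}},
    "doc3": {"order": {"customer": {"name": "Bob"}}},
}
path_index, doc_index = build_indexes(docs)
paths, hits = search(path_index, doc_index, "/customer/name", "Ada")
```

In this sketch the first structure stays small because it holds each distinct path once, regardless of how many documents or values share it; the second structure is consulted only with fully resolved paths.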
Address Armonk NY US