Title of invention INFORMATION-RETRIEVAL SYSTEMS, METHODS, AND SOFTWARE WITH CONCEPT-BASED SEARCHING AND RANKING
Abstract A system has a processor and a memory and further comprises a set of target documents and means for searching and identifying, by the processor, one or more of the set of target documents as result documents based on identifying a set of at least one concept associated with a user query. The means for searching and identifying includes means for identifying, by the processor, a first set of documents based at least in part on a set of word co-occurrence probabilities, with the set of word co-occurrence probabilities derived from at least one corpus of documents related to the set of at least one concept. The means for searching and identifying one or more of the set of target documents further includes means for identifying one or more second documents as result documents based on inverse-document-frequency information, means for ranking the result documents based on the inverse-document-frequency information and the set of word co-occurrence probabilities, and means for ranking the result documents based on the importance of the set of at least one concept to each of the result documents, determined based at least in part on the set of word co-occurrence probabilities. Also disclosed are related methods of processing a query and searching methods. A method and a system for using a query having one or more query terms to identify a set of one or more documents within a database are further disclosed. The system has a processor and a memory and further comprises a code set adapted to identify a set of at least one concept associated with the one or more query terms comprising a query; determine, for each of one or more documents in the database, a score based on a) occurrence of one or more of the query terms in the document and b) occurrence of one or more non-query terms in the document; and display one or more of the documents within a search result based on each document's determined score. The non-query terms are known to co-occur with one or more of the query terms in a set of documents and are associated with the set of at least one concept.
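The scoring idea in the abstract (query-term occurrence plus occurrence of non-query terms that co-occur with the query terms in a concept-related corpus, weighted by inverse document frequency) can be illustrated with a minimal sketch. This is not the claimed implementation: the function names, the smoothed IDF formula, the document-level co-occurrence estimate, and the additive way the two signals are combined are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def cooccurrence_probs(corpus, query_terms):
    """Estimate P(term | query term) from a concept-related corpus.

    Assumption: the probability is the fraction of corpus documents
    containing the query term that also contain the other term.
    """
    probs = {}
    for q in query_terms:
        docs_with_q = [set(d) for d in corpus if q in d]
        if not docs_with_q:
            continue
        counts = Counter(t for d in docs_with_q for t in d if t != q)
        for term, count in counts.items():
            probs[(q, term)] = count / len(docs_with_q)
    return probs


def idf(term, targets):
    """Smoothed inverse document frequency over the target documents."""
    df = sum(1 for d in targets if term in d)
    return math.log((len(targets) + 1) / (df + 1)) + 1.0


def score(doc, query_terms, probs, targets):
    """Hypothetical score: (a) query-term hits plus (b) co-occurring non-query-term hits."""
    doc_terms = set(doc)
    total = 0.0
    # (a) occurrence of query terms in the document
    for q in query_terms:
        if q in doc_terms:
            total += idf(q, targets)
    # (b) occurrence of non-query terms known to co-occur with query terms
    for (_, term), p in probs.items():
        if term in doc_terms and term not in query_terms:
            total += p * idf(term, targets)
    return total


def rank(targets, query_terms, corpus):
    """Return target-document indices ordered by descending score."""
    probs = cooccurrence_probs(corpus, query_terms)
    scored = [(score(d, query_terms, probs, targets), i) for i, d in enumerate(targets)]
    return [i for _, i in sorted(scored, key=lambda pair: (-pair[0], pair[1]))]
```

Under these assumptions, a target document that never mentions a query term can still receive a non-zero score if it contains terms that frequently co-occur with that query term in the concept corpus, which is the behaviour the abstract attributes to the non-query terms.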
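Publication number NZ578672(A) Publication date 2012.08.31
Application number NZ20070578672 Filing date 2007.12.27
Applicant THOMSON REUTERS GLOBAL RESOURCES Inventors CUSTIS, TONYA;AL-KOFAHI, KHALID
Classification G06F17/30 Main classification G06F17/30
Agency Agent
Principal claim
Address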