Title of invention: Abstract generating search method and system
Abstract: The present disclosure provides an information search method and system for use in an information search system in which each document has corresponding forward index data, addressing the low search efficiency of existing information search techniques. In one aspect, the method may include: receiving an inquiry word and obtaining one or more keywords contained in the inquiry word by segmentation; searching, through the information search system's inverted index data, for one or more documents matching the one or more keywords and for the forward index data corresponding to the one or more documents; and determining an abstract of each of the one or more documents according to that document's forward index data, and outputting the abstract and information of the one or more documents as a search result. The proposed techniques can increase the efficiency of information search while, at the same time, guaranteeing the accuracy of the search to a certain extent.
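The first two steps of the abstract (segmentation, then inverted-index lookup that also yields each matched document's forward index data) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `segment`, `build_indexes`, and `search` are hypothetical names, segmentation is approximated by whitespace splitting, the forward index is modeled as a plain token list per document, and a document is treated as a match if it contains any one of the keywords.

```python
from collections import defaultdict


def segment(text):
    # Assumption: whitespace splitting stands in for real word segmentation.
    return text.lower().split()


def build_indexes(docs):
    """Build a forward index (doc id -> tokens) and an inverted index
    (token -> set of doc ids) from a dict of doc id -> text."""
    forward = {}
    inverted = defaultdict(set)
    for doc_id, text in docs.items():
        tokens = segment(text)
        forward[doc_id] = tokens
        for token in tokens:
            inverted[token].add(doc_id)
    return forward, inverted


def search(inquiry, forward, inverted):
    """Return the keywords and the forward index data of every document
    that contains at least one keyword (assumed matching rule)."""
    keywords = segment(inquiry)
    matched = set()
    for keyword in keywords:
        matched |= inverted.get(keyword, set())
    return keywords, {doc_id: forward[doc_id] for doc_id in sorted(matched)}
```

For example, with `docs = {1: "red apple pie", 2: "green pear"}`, calling `search("Apple", *build_indexes(docs))` returns the keyword list `["apple"]` together with the token list of document 1 only.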
Publication number: US9367605(B2) Publication date: 2016.06.14
Application number: US201012937562 Filing date: 2010.08.27
Applicant: Alibaba Group Holding Limited Inventor: Luo Yi
Classification: G06F17/30;G06F7/00 Main classification: G06F17/30
Agency: Lee & Hayes, PLLC Agent: Lee & Hayes, PLLC
Main claim: 1. A method comprising: receiving, by a computing device, an inquiry word; segmenting, by the computing device, the inquiry word into one or more keywords; searching, by the computing device, an inverted index of a group of documents to determine in the group one or more documents in which one or more of the keywords are matched; and searching, by the computing device, a forward index of a respective document of the determined one or more documents to generate an abstract for the respective document, the searching including: determining a length limit of the abstract; identifying a plurality of portions within the respective document, each portion of the plurality of portions including a respective beginning position in the respective document and a respective ending position in the respective document, the identifying including identifying, within the respective document, every portion that is within the length limit by traversing the forward index character-by-character or word-by-word; finding a portion among the plurality of portions, the portion including a highest number of the one or more keywords between a beginning position and an ending position compared with any other portion of the plurality of portions; and selecting the found portion to be the abstract of the respective document.
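The abstract-selection step of the claim can be illustrated with a short word-by-word sketch. It is only one possible reading, with several assumptions that the claim does not fix: the forward index is modeled as a list of tokens, the length limit is measured in words, "highest number of keywords" is taken to mean total keyword occurrences, and ties are resolved in favor of the first portion found. The names `best_portion` and `make_abstract` are hypothetical.

```python
def best_portion(tokens, keywords, length_limit):
    """Enumerate every portion tokens[begin:end] whose length is within
    length_limit and return (begin, end, count) for the portion holding
    the most keyword occurrences. Ties keep the first portion found
    (an assumption; the claim does not specify a tie-break)."""
    keyword_set = set(keywords)
    best = (0, 0, 0)  # empty portion when no keyword occurs at all
    for begin in range(len(tokens)):
        count = 0
        last_end = min(len(tokens), begin + length_limit)
        for end in range(begin + 1, last_end + 1):
            # Extend the portion by one word and update its keyword count.
            if tokens[end - 1] in keyword_set:
                count += 1
            if count > best[2]:
                best = (begin, end, count)
    return best


def make_abstract(tokens, keywords, length_limit):
    # Join the selected portion back into text to serve as the abstract.
    begin, end, _ = best_portion(tokens, keywords, length_limit)
    return " ".join(tokens[begin:end])
```

Enumerating all portions costs on the order of the document length times the length limit. Because a longer portion never contains fewer keywords than a shorter one nested inside it, a single sliding window of exactly the limit length would find the same maximum count; the exhaustive form is kept here only because it mirrors the claim's wording more directly.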
Address: Grand Cayman KY