Title of invention DOCUMENT RETRIEVAL METHOD, SYSTEM, AND PROGRAM
Abstract PROBLEM TO BE SOLVED: To provide a document retrieval method that shortens document registration time and retrieval time while using an N-Gram index system. SOLUTION: For each Gram of document data to be stored in a document data area 37, it is judged whether the Gram is a low-frequency integrated Gram or a high-frequency general Gram. For an integrated Gram, post data consisting of a Gram value calculated from the character string of the Gram, the document ID of the document data containing that character string, and the offset within the document is stored in an integrated Gram post area 35. For a general Gram, post data consisting of the document ID of the document data containing the character string of the Gram and the offset within the document is stored in a general Gram post area 36. At retrieval time, post data is read from the integrated Gram post area 35 according to the Gram value calculated for the character string of each Gram in the retrieval keyword, and from the general Gram post area 36 according to the Gram of the retrieval keyword. Document data matching the retrieval keyword is then retrieved from the document data area 37 using the read post data. COPYRIGHT: (C)2009,JPO&INPIT
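The two-area scheme in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say how a Gram is judged low- or high-frequency, how the Gram value is calculated, or what N is. Here the set of general Grams is simply passed in, the Gram value is assumed to be a CRC32 checksum reduced modulo a small constant, and N is assumed to be 2. The names TwoAreaNgramIndex, gram_value, and NUM_VALUES are hypothetical.

```python
import zlib

N = 2            # Gram length (assumption; the abstract does not fix N)
NUM_VALUES = 64  # size of the Gram-value space (hypothetical)


def gram_value(gram: str) -> int:
    """Gram value of a character string (assumption: CRC32 modulo NUM_VALUES)."""
    return zlib.crc32(gram.encode("utf-8")) % NUM_VALUES


class TwoAreaNgramIndex:
    def __init__(self, general_grams):
        # Assumption: the high-frequency (general) Grams are known in advance.
        self.general_grams = set(general_grams)
        self.document_data = {}     # stands in for document data area 37
        self.integrated_posts = {}  # area 35: Gram value -> [(doc_id, offset)]
        self.general_posts = {}     # area 36: Gram string -> [(doc_id, offset)]

    def register(self, doc_id, text):
        self.document_data[doc_id] = text
        for offset in range(len(text) - N + 1):
            gram = text[offset:offset + N]
            if gram in self.general_grams:
                self.general_posts.setdefault(gram, []).append((doc_id, offset))
            else:
                # Low-frequency Grams share post lists through their Gram value.
                key = gram_value(gram)
                self.integrated_posts.setdefault(key, []).append((doc_id, offset))

    def _posts(self, gram):
        if gram in self.general_grams:
            return self.general_posts.get(gram, [])
        return self.integrated_posts.get(gram_value(gram), [])

    def search(self, keyword):
        if len(keyword) < N:
            # Keyword shorter than one Gram: fall back to scanning the documents.
            return sorted(d for d, t in self.document_data.items() if keyword in t)
        candidates = None
        for i in range(len(keyword) - N + 1):
            # Shift each offset back by i to get the implied keyword start.
            starts = {(d, off - i) for d, off in self._posts(keyword[i:i + N])}
            candidates = starts if candidates is None else candidates & starts
            if not candidates:
                return []
        # Different Grams can share a Gram value, so confirm against the text.
        return sorted({d for d, s in candidates
                       if s >= 0 and self.document_data[d].startswith(keyword, s)})


index = TwoAreaNgramIndex(general_grams={"th", "he"})
index.register(1, "the cat sat")
index.register(2, "the dog ran")
index.register(3, "a catalog")
print(index.search("cat"))  # [1, 3]
print(index.search("the"))  # [1, 2]
```

Because several low-frequency Grams may map to the same Gram value, the integrated area can return false candidates. The sketch therefore checks every candidate against the stored document text before reporting a match.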
Publication No. JP2009104669(A) Publication date 2009.05.14
Application No. JP20090029624 Filing date 2009.02.12
Applicant TOSHIBA CORP Inventor HATTORI MASAKAZU
Classification G06F17/30 Main classification G06F17/30
Agency Agent
Main claim
Address