Title of invention TWO-LEVEL N-GRAM INDEX STRUCTURE, METHOD OF BUILDING INDEX, METHOD OF PROCESSING QUERY, AND METHOD OF DERIVING INDEX
Abstract PROBLEM TO BE SOLVED: To build a two-level n-gram inverted index structure that reduces the size of an n-gram inverted index and improves query performance by eliminating the redundancy of the position information stored in the n-gram inverted index. SOLUTION: The inverted index comprises a back-end inverted index, which uses subsequences extracted from the documents as terms, and a front-end inverted index, which uses n-grams extracted from those subsequences as terms. The back-end inverted index takes as its terms subsequences of a specific length, extracted from the documents so that adjacent subsequences overlap by n-1 (n: the length of the n-gram), and stores, in a posting list for each subsequence, the positions at which that subsequence occurs in the documents. The front-end inverted index takes as its terms n-grams of a specific length, extracted from the subsequences using a 1-sliding technique, and stores, in a posting list for each n-gram, the positions at which that n-gram occurs within the subsequences. COPYRIGHT: (C)2007,JPO&INPIT
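The two-level structure described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation: the function names, the parameter name `m` for the subsequence length, and the dictionary-based posting lists are assumptions made for illustration, and the final subsequence of a document is simply left shorter rather than padded.

```python
from collections import defaultdict


def extract_subsequences(text, m, n):
    """Return (offset, subsequence) pairs of length m (the last may be shorter).

    Adjacent subsequences overlap by n - 1 characters, so every n-gram of
    the text lies entirely inside exactly one subsequence.
    """
    step = m - (n - 1)
    if n < 1 or step <= 0:
        raise ValueError("need n >= 1 and m >= n")
    subsequences = []
    for offset in range(0, len(text), step):
        subsequences.append((offset, text[offset:offset + m]))
        if offset + m >= len(text):
            break  # this subsequence already reaches the end of the text
    return subsequences


def build_two_level_index(docs, m, n):
    """Build hypothetical back-end and front-end inverted indexes.

    back:  subsequence -> [(doc_id, offset in document), ...]
    front: n-gram      -> [(subsequence, offset in subsequence), ...]
    """
    back = defaultdict(list)
    front = defaultdict(list)
    for doc_id, text in docs.items():
        for offset, sub in extract_subsequences(text, m, n):
            if sub not in back:
                # First time this subsequence is seen: index its n-grams
                # with a 1-sliding window. Repeated subsequences add no
                # new front-end postings, which is where space is saved.
                for i in range(len(sub) - n + 1):
                    front[sub[i:i + n]].append((sub, i))
            back[sub].append((doc_id, offset))
    return back, front


def ngram_positions(back, front, gram):
    """Recover the (doc_id, position) occurrences of an n-gram by
    combining the two posting lists."""
    positions = []
    for sub, inner in front.get(gram, []):
        for doc_id, offset in back[sub]:
            positions.append((doc_id, offset + inner))
    return sorted(positions)
```

Under these assumptions, `build_two_level_index({0: "ABCDDABBCD", 1: "DABCDABCDA"}, m=4, n=2)` indexes each document as three subsequences of length 4, and `ngram_positions(back, front, "AB")` recovers the document-level positions of a 2-gram from the two levels.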
Publication number JP2007080259(A) Publication date 2007.03.29
Application number JP20060229044 Filing date 2006.08.25
Applicant KOREA ADVANCED INST OF SCIENCE & TECHNOL Inventors WHANG KYU-YOUNG;KIM MIN-SOO;LEE JAE-GIL;LEE MIN-JAE
Classification G06F17/30 Main classification G06F17/30
Agency Agent
Main claim
Address