Title: Sequence based indexing and retrieval method for text documents
Abstract: A sequence-based indexing and retrieval method for a collection of text documents includes the steps of: generating a query token sequence from a query; generating at least one representative token sequence from each document that contains at least one token of the query token sequence; measuring the similarity between each representative token sequence and the query token sequence; and retrieving text documents in response to the similarity of each representative token sequence with respect to the query token sequence. The similarity measurement is performed by determining a token appearance score, a token order score, and a token consecutiveness score of the representative token sequence with respect to the query token sequence. Together these scores characterize the similarity between the two sequences, so that text documents can be retrieved precisely and effectively.
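The abstract names three scores but does not give their formulas, so the following is only a minimal sketch under stated assumptions, not the patented method. It assumes whitespace tokenization, uses the full document token sequence as the representative sequence, defines the appearance score as the fraction of distinct query tokens present, the order score as the longest-common-subsequence length divided by the query length, the consecutiveness score as the fraction of adjacent query token pairs that are also adjacent in the document, and combines the three with equal weights. All function names are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def tokenize(text: str) -> List[str]:
    # Assumption: lowercase whitespace tokenization stands in for the real tokenizer.
    return text.lower().split()


def appearance_score(rep: Sequence[str], query: Sequence[str]) -> float:
    # Fraction of distinct query tokens that occur anywhere in rep.
    q = set(query)
    return len(q & set(rep)) / len(q) if q else 0.0


def order_score(rep: Sequence[str], query: Sequence[str]) -> float:
    # Longest common subsequence length, normalized by query length.
    if not query:
        return 0.0
    prev = [0] * (len(query) + 1)
    for r in rep:
        cur = [0]
        for j, q in enumerate(query, 1):
            cur.append(prev[j - 1] + 1 if r == q else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1] / len(query)


def consecutiveness_score(rep: Sequence[str], query: Sequence[str]) -> float:
    # Fraction of adjacent query token pairs that are also adjacent in rep.
    q_pairs = list(zip(query, query[1:]))
    if not q_pairs:
        return 0.0
    rep_pairs = set(zip(rep, rep[1:]))
    return sum(p in rep_pairs for p in q_pairs) / len(q_pairs)


def similarity(rep: Sequence[str], query: Sequence[str]) -> float:
    # Assumption: equal weighting of the three scores.
    return (appearance_score(rep, query)
            + order_score(rep, query)
            + consecutiveness_score(rep, query)) / 3.0


def retrieve(docs: Dict[str, str], query: str) -> List[Tuple[str, float]]:
    q = tokenize(query)
    q_set = set(q)
    results = []
    for doc_id, text in docs.items():
        rep = tokenize(text)  # assumption: representative sequence = all document tokens
        if q_set & set(rep):  # only documents sharing at least one query token
            results.append((doc_id, similarity(rep, q)))
    return sorted(results, key=lambda item: item[1], reverse=True)


docs = {
    "a": "sequence based indexing of text documents",
    "b": "documents text indexing",
    "c": "unrelated words here",
}
ranking = retrieve(docs, "indexing text documents")
```

In this toy collection, document "a" ranks above "b" because it preserves the query order and one adjacent pair, while "c" is skipped because it shares no token with the query.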
Publication number: US2005210003(A1)  Publication date: 2005.09.22
Application number: US20040803478  Filing date: 2004.03.17
Applicant: TSAY YIH-KUEN;YU CHING-LIN;CHEN YU-FANG  Inventor: TSAY YIH-KUEN;YU CHING-LIN;CHEN YU-FANG
Classification: G06F7/00;G06F17/30;(IPC1-7):G06F7/00  Main classification: G06F7/00
Agency:  Agent:
Main claim:
Address: