Title of Invention |
Method and apparatus for constructing a compact similarity structure and for using the same in analyzing document relevance |
Abstract |
A computer-readable medium comprises a data structure for providing information about levels of similarity between pairs of N documents. The data structure comprises a plurality of entries of similarity values representing levels of similarity for a plurality of pairs of the documents. Each of the similarity values represents a level of similarity of one document of a given pair relative to the other document of the given pair. The similarity value of each entry is greater than a threshold similarity value that is greater than zero. The similarity-value entries are fewer than $N^2 - N$ in number if the similarity values are asymmetric with regard to document pairing, and fewer than $\frac{N^2 - N}{2}$ in number if the similarity values are symmetric with regard to document pairing. A method and apparatus for generating the data structure are described.
|
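The thresholded structure described in the abstract can be illustrated with a minimal sketch. This is not the patented method: the Jaccard measure, the whitespace tokenizer, the function names `jaccard` and `build_compact_similarity`, the sample documents, and the threshold of 0.2 are all assumptions chosen for illustration. The sketch covers only the symmetric case, where one entry per unordered pair suffices and the dense upper bound is (N^2 - N)/2.

```python
from itertools import combinations


def jaccard(a, b):
    """Symmetric similarity between two token sets, in [0, 1] (illustrative choice)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_compact_similarity(docs, threshold):
    """Return {(i, j): similarity} for pairs i < j whose similarity exceeds threshold.

    Pairs at or below the threshold are simply not stored, which is what
    keeps the structure smaller than the dense (N^2 - N) / 2 table.
    """
    if threshold <= 0:
        raise ValueError("threshold must be greater than zero")
    # Hypothetical tokenizer: lowercase and split on whitespace.
    token_sets = [set(d.lower().split()) for d in docs]
    entries = {}
    for i, j in combinations(range(len(docs)), 2):
        s = jaccard(token_sets[i], token_sets[j])
        if s > threshold:
            entries[(i, j)] = s
    return entries


# Made-up sample documents, for illustration only.
docs = [
    "sparse similarity matrix for documents",
    "sparse similarity structure for documents",
    "weather forecast for tomorrow",
    "stock market report",
]
entries = build_compact_similarity(docs, threshold=0.2)

n = len(docs)
max_symmetric = (n * n - n) // 2  # dense upper bound for symmetric similarities
print(len(entries), "stored entries out of a possible", max_symmetric)
```

With these sample documents only the pair (0, 1) clears the threshold, so a single entry is stored instead of six. For an asymmetric measure, the same idea would apply to ordered pairs, with a dense bound of N^2 - N.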
Publication Number |
US7472131(B2) |
Publication Date |
2008.12.30 |
Application Number |
US20050298500 |
Filing Date |
2005.12.12 |
Applicant |
JUSTSYSTEMS EVANS RESEARCH, INC. |
Inventors |
SHANAHAN JAMES G.;ROMA NORBERT;EVANS DAVID A. |
Classification Numbers |
G06F7/00;G06F17/00;G06F17/30 |
Main Classification Number |
G06F7/00 |
Agency |
|
Agent |
|
Principal Claim |
|
Address |
|