Title of Invention |
SYSTEM AND METHOD FOR PERFORMING ELECTRONIC INFORMATION RETRIEVAL USING KEYWORDS |
Abstract |
Output documents similar to an input document are identified. A query is formulated using a list of best keywords from the input document to search for a first set of output documents. The list of best keywords is defined with a maximum number of keywords less than the total number of keywords in the list of best keywords that are identified as belonging to a domain-specific dictionary of words and as having no measurable linguistic frequency. Lists of keywords are identified for each output document in the first set of documents. A second set of similar documents is determined using a measure of similarity that is computed between keywords identified in the input document and each output document in the first set of documents.
|
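The two-stage retrieval summarized in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the function names, the in-memory `corpus` standing in for a search engine, the first-seen ordering used to pick the "best" keywords, the Jaccard coefficient used as the similarity measure, and the `threshold` cutoff.

```python
import re


def extract_keywords(text, domain_dictionary, frequency_table):
    """Keep words that are in the domain dictionary and have no measurable
    linguistic frequency (assumed here: missing from, or zero in, the table)."""
    keywords = []
    for word in re.findall(r"[a-z]+", text.lower()):
        in_domain = word in domain_dictionary
        no_frequency = frequency_table.get(word, 0) == 0
        if in_domain and no_frequency and word not in keywords:
            keywords.append(word)
    return keywords


def jaccard(a, b):
    """Assumed similarity measure: size of intersection over size of union."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_similar(input_text, corpus, domain_dictionary, frequency_table,
                 max_keywords=2, threshold=0.5):
    """Return (doc_id, score) pairs for the second set, best score first."""
    input_keywords = extract_keywords(input_text, domain_dictionary, frequency_table)

    # Query: the best keywords, capped at max_keywords.
    # Assumption: "best" is approximated by first-seen order.
    query = set(input_keywords[:max_keywords])

    # First set: documents containing at least one query keyword
    # (a stand-in for submitting the query to a search service).
    first_set = {
        doc_id: text
        for doc_id, text in corpus.items()
        if query & set(re.findall(r"[a-z]+", text.lower()))
    }

    # Second set: first-set documents whose keyword list is similar enough
    # to the full keyword list of the input document.
    scored = []
    for doc_id, text in first_set.items():
        doc_keywords = extract_keywords(text, domain_dictionary, frequency_table)
        score = jaccard(input_keywords, doc_keywords)
        if score >= threshold:
            scored.append((doc_id, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

In this sketch the query is deliberately narrower than the full keyword list, while the similarity score uses every keyword found in the input document, so a document can match the query and still be dropped from the second set.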
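Application Publication Number |
US2005086205(A1) |
Application Publication Date |
2005.04.21 |
Application Number |
US20030605630 |
Application Date |
2003.10.15 |
Applicant |
XEROX CORPORATION |
Inventors |
FRANCIOSA ALAIN;DANCE CHRISTOPHER R. |
Classification Number |
G06F17/30;(IPC1-7):G06F17/30 |
Main Classification Number |
G06F17/30 |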
Agency |

Agent |

Principal Claim |

Address |
