Title of invention DEVICE AND METHOD FOR JUDGING DOCUMENT SIMILARITY
Abstract PROBLEM TO BE SOLVED: To provide a document similarity judging device that accurately judges whether or not a new document is similar to a set of base documents. SOLUTION: From each base document held in a base document set holding part 101, parts (ranges within a document) that suit the purpose of the similarity judgment are extracted, and an emphasis point holding part 102 holds them as emphasis points. A base document set feature generation part 103 then generates feature information on the base document set (a vector representing the frequency of each word present in those ranges of the documents) from the base documents and the emphasis points. Further, a new document feature generating means 105 generates feature information on the new document (a vector representing the frequency of each word present in the document) from the new document held in a new document holding part 104. The feature information on the base document set is then compared with the feature information on the new document, and a similarity judging means 106 judges the similarity (for example, by whether the inner product of the two vectors is at least a certain value).
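The comparison described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the patented implementation: the function names, the regex tokenizer, the representation of an emphasis point as a (start, end) character range, and the threshold value are all assumptions introduced here.

```python
import re
from collections import Counter


def tokenize(text):
    # Assumed tokenizer: lowercase runs of letters and digits.
    return re.findall(r"[a-z0-9]+", text.lower())


def base_set_features(base_docs, emphasis_points):
    # Word-frequency vector over the emphasized range of each base document.
    # Each emphasis point is assumed to be a (start, end) character range.
    features = Counter()
    for doc, (start, end) in zip(base_docs, emphasis_points):
        features.update(tokenize(doc[start:end]))
    return features


def new_doc_features(doc):
    # Word-frequency vector over the whole new document.
    return Counter(tokenize(doc))


def inner_product(a, b):
    # Counter returns 0 for words missing from b.
    return sum(count * b[word] for word, count in a.items())


def is_similar(base_features, new_features, threshold):
    # Judged similar when the inner product reaches the (assumed) threshold.
    return inner_product(base_features, new_features) >= threshold


if __name__ == "__main__":
    base_docs = [
        "Header text. Cats chase mice.",
        "Notes: dogs chase cats daily.",
    ]
    # Hypothetical emphasis points: skip the leading header or label.
    points = [
        (base_docs[0].index("Cats"), len(base_docs[0])),
        (base_docs[1].index("dogs"), len(base_docs[1])),
    ]
    base = base_set_features(base_docs, points)
    new = new_doc_features("Cats chase dogs")
    print(inner_product(base, new))            # 5
    print(is_similar(base, new, threshold=3))  # True
```

Restricting the base-set vector to the emphasized ranges means that words appearing only outside those ranges (here "header" or "notes") contribute nothing to the inner product, which is the effect the abstract attributes to the emphasis points.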
Publication number JPH09282331(A) Publication date 1997.10.31
Application number JP19960111158 Filing date 1996.04.09
Applicant CANON INC Inventors ITO SHIRO; OTANI NORIKO; UEDA TAKANARI; IKEDA YUJI
Classification G06F17/21; G06F17/30 Main classification G06F17/21
Agency Agent
Main claim
Address