Title of invention: DOCUMENT SIMILARITY CALCULATION DEVICE
Abstract:
PROBLEM TO BE SOLVED: To provide a document similarity calculation device capable of reducing the processing load.
SOLUTION: A document similarity calculation device 100 calculates a similarity indicating how similar a plurality of documents are to one another. The device 100 includes: a related word group storage part 101, which stores related word groups, each consisting of mutually related words; a word document frequency matrix generation part 102, which generates a word document frequency matrix whose elements are, for each combination of document and word, the frequency at which the word appears in the document; a word document frequency matrix conversion part 103, which converts the generated word document frequency matrix on the basis of the stored related word groups so as to reduce its number of dimensions; and a similarity calculation part 104, which calculates the similarity on the basis of the converted word document frequency matrix.
COPYRIGHT: (C)2013, JPO&INPIT
Publication number: JP2013008255(A); Publication date: 2013.01.10
Application number: JP20110141329; Filing date: 2011.06.27
Applicant: NEC CORP; Inventor: MIURA MITSUGI
Classification: G06F17/30; Main classification: G06F17/30
Agency: ; Agent:
Principal claim:
Address:
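
The pipeline described in the abstract (generate a word document frequency matrix, shrink it using related word groups, then compute similarity) can be illustrated with a minimal Python sketch. The abstract does not say how the conversion merges related words or which similarity measure is used, so two choices here are assumptions made only for illustration: rows of related words are summed into one row, and similarity is cosine similarity between document columns. All function names are hypothetical and do not come from the patent.

```python
import math
from collections import Counter


def build_term_document_matrix(documents):
    """Return (vocabulary, matrix); matrix[i][j] counts vocabulary[i] in documents[j]."""
    counts = [Counter(doc.lower().split()) for doc in documents]
    vocabulary = sorted(set().union(*counts))
    matrix = [[c[word] for c in counts] for word in vocabulary]
    return vocabulary, matrix


def merge_related_words(vocabulary, matrix, related_groups):
    """Collapse the rows of words in the same related group into a single row.

    Assumption: rows are merged by summing their frequencies. The abstract only
    states that the conversion reduces the number of dimensions.
    """
    representative = {}
    for group in related_groups:
        label = "|".join(sorted(group))
        for word in group:
            representative[word] = label

    merged = {}
    for word, row in zip(vocabulary, matrix):
        key = representative.get(word, word)
        if key in merged:
            merged[key] = [a + b for a, b in zip(merged[key], row)]
        else:
            merged[key] = list(row)

    new_vocabulary = sorted(merged)
    return new_vocabulary, [merged[key] for key in new_vocabulary]


def cosine_similarity(matrix, j, k):
    """Cosine similarity between document columns j and k (an assumed measure)."""
    col_j = [row[j] for row in matrix]
    col_k = [row[k] for row in matrix]
    dot = sum(a * b for a, b in zip(col_j, col_k))
    norm = math.sqrt(sum(a * a for a in col_j)) * math.sqrt(sum(b * b for b in col_k))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    docs = ["the car is fast", "the automobile is fast", "the cat sleeps"]
    vocab, tdm = build_term_document_matrix(docs)
    reduced_vocab, reduced = merge_related_words(vocab, tdm, [{"car", "automobile"}])
    print(len(vocab), "rows before conversion,", len(reduced_vocab), "after")
    print("similarity(doc 0, doc 1) before:", cosine_similarity(tdm, 0, 1))
    print("similarity(doc 0, doc 1) after: ", cosine_similarity(reduced, 0, 1))
```

In this toy input, merging "car" and "automobile" removes one row from the matrix, so the later similarity step works on fewer dimensions, and the two documents that differ only in that synonym become identical after the conversion.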