Title of invention TEXT DATA SIMILARITY CALCULATION METHOD, TEXT DATA SIMILARITY CALCULATION APPARATUS, AND TEXT DATA SIMILARITY CALCULATION PROGRAM
Abstract PROBLEM TO BE SOLVED: To provide a text data similarity calculation method that accurately calculates the similarity between texts. SOLUTION: The text data similarity calculation method comprises: a weighting factor calculation step (S102) for extracting words from a plurality of text data, analyzing modification information between the words (S101), and calculating a weighting factor for each word based on the number of extracted words; an interword similarity calculation step for generating structured data of the text data (S103), based on the modification information between the words in the text data extracted by a word information extraction step, and calculating the similarity between each word of first structured data, generated from one piece of the text data by a structured data generation step, and each word of second structured data, generated from another piece of the text data; and a partial structured data similarity calculation step (S105) for calculating the similarity between the first structured data and the second structured data on the basis of the similarity calculated by the interword similarity calculation step and the weighting factor. COPYRIGHT: (C)2006, JPO&NCIPI
Publication number JP2006139708(A) Publication date 2006.06.01
Application number JP20040330939 Filing date 2004.11.15
Applicant RICOH CO LTD Inventors KENMOCHI EIJI;SATO NAHOKO;SHIMADA ATSUO
Classification G06F17/30;G06F17/28 Main classification G06F17/30
Agency Agent
Main claim
Address
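
The steps named in the abstract can be illustrated with a minimal Python sketch. This is not the patented method: the abstract does not give the formulas, so everything concrete here is an assumption made for illustration. Structured data is assumed to be a list of (modifier, head) word pairs produced by an upstream dependency parser; the weighting factor is assumed to be a log inverse-frequency over the extracted words; the interword similarity is assumed to be a Dice coefficient over character bigrams; and the function names `word_weights`, `word_similarity`, and `structure_similarity` are hypothetical.

```python
import math
from collections import Counter

# A "structure" is a list of (modifier, head) word pairs. Producing these
# pairs from raw text (dependency parsing) is assumed to happen upstream.


def word_weights(structures):
    """Weighting factor per word (cf. S102). Assumed formula: words that
    occur less often among all extracted words receive a larger weight."""
    counts = Counter(w for s in structures for pair in s for w in pair)
    total = sum(counts.values())
    return {w: math.log(1 + total / c) for w, c in counts.items()}


def word_similarity(a, b):
    """Interword similarity. Assumed measure: Dice coefficient over
    character bigrams, with identical words scoring 1.0."""
    if a == b:
        return 1.0
    ga = {a[i:i + 2] for i in range(len(a) - 1)}
    gb = {b[i:i + 2] for i in range(len(b) - 1)}
    if not ga or not gb:
        return 0.0
    return 2 * len(ga & gb) / (len(ga) + len(gb))


def structure_similarity(first, second, weights):
    """Similarity between two structures (cf. S105). Assumed combination:
    weighted average, over the pairs in `first`, of the best-matching
    pair in `second`."""
    if not first or not second:
        return 0.0
    num = den = 0.0
    for m1, h1 in first:
        # Words missing from the weight table fall back to a weight of 1.0.
        w = weights.get(m1, 1.0) + weights.get(h1, 1.0)
        best = max(
            word_similarity(m1, m2) * word_similarity(h1, h2)
            for m2, h2 in second
        )
        num += w * best
        den += w
    return num / den


doc_a = [("fast", "search"), ("text", "search")]
doc_b = [("fast", "searching"), ("text", "data")]
weights = word_weights([doc_a, doc_b])
score = structure_similarity(doc_a, doc_b, weights)
```

Under these assumptions the score is directional: it measures how well the pairs of the first structure are covered by the second, so swapping the arguments can give a different value. Averaging both directions is one way to obtain a symmetric score.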