Title of Invention REPRESENTATIVE WORD EXTRACTION DEVICE, REPRESENTATIVE WORD EXTRACTION METHOD, AND REPRESENTATIVE WORD EXTRACTION PROGRAM
Abstract PROBLEM TO BE SOLVED: To extract a word that represents a document group, independently of the number of documents included in that group. SOLUTION: A preprocessing part 11 collects document groups, including the target document group from which a representative word is to be extracted, and a reference word acquiring part 13 acquires a reference word that serves as the basis for the extraction. A reference document specifying part 14 identifies, among the document groups input from the preprocessing part 11, a reference document that contains the reference word, and a word group extracting part 15 extracts the reference word and the words other than the reference word from the reference document as a word group. For each word of the extracted word group, an index calculating part 16 calculates an index whose value increases or decreases with the word's co-occurrence frequency with the reference word. An index correcting part 17 then calculates, for each word of the extracted word group, its degree of rarity in the whole set of document groups and its degree of rarity in the target document group, and corrects the index calculated by the index calculating part 16 using these two degrees of rarity. COPYRIGHT: (C)2012, JPO&INPIT
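The pipeline in the abstract (reference documents, word group, co-occurrence index, rarity-based correction) can be illustrated with a minimal Python sketch. The abstract gives no formulas, so everything concrete here is an assumption made for illustration: documents are modeled as sets of words, the index is the number of reference documents containing a word, rarity is a smoothed IDF-style value, and the correction multiplies the index by the overall rarity and divides it by the rarity within the target group. The names `rarity` and `representative_words` are hypothetical and do not come from the patent.

```python
import math
from collections import Counter


def rarity(word, docs):
    """Assumed IDF-style rarity: larger when fewer documents contain the word."""
    df = sum(1 for d in docs if word in d)
    # Add-one smoothing keeps the value finite and positive for any df.
    return math.log((len(docs) + 1) / (df + 1)) + 1.0


def representative_words(all_docs, target_docs, reference_word):
    """Rank words by a rarity-corrected co-occurrence index (illustrative only)."""
    # Reference documents: documents that contain the reference word.
    reference_docs = [d for d in all_docs if reference_word in d]

    # Word group and index: for each word in the reference documents, count
    # how many reference documents contain it (co-occurrence frequency).
    index = Counter(w for d in reference_docs for w in set(d))

    scores = {}
    for word, cooc in index.items():
        r_all = rarity(word, all_docs)        # rarity in the whole collection
        r_target = rarity(word, target_docs)  # rarity in the target group
        # Assumed correction: favor words that are rare overall but
        # common inside the target document group.
        scores[word] = cooc * r_all / r_target

    # Highest score first; ties are broken alphabetically.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

Under these assumptions, a word that co-occurs with the reference word as often as a common function word still ranks above it, because the function word is frequent across the whole collection and therefore has a low overall rarity.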
Publication No. JP2011242975(A) Publication Date 2011.12.01
Application No. JP20100114051 Filing Date 2010.05.18
Applicant NIPPON TELEGR & TELEPH CORP <NTT> Inventors NAGANO SHOICHI;ICHIKAWA YUSUKE;KOBAYASHI TORU
Classification G06F17/27;G06F17/30 Main Classification G06F17/27
Agency Agent
Main Claim
Address