Title of invention WORD EXTRACTION METHOD, DEVICE AND PROGRAM
Abstract PROBLEM TO BE SOLVED: To extract a wide variety of words, including low-frequency words, from the document set targeted for word extraction, without the result depending on chance occurrences. SOLUTION: A partial character string statistic calculation part 330 reads data on partial character strings from a work area 600, calculates their statistics, and stores the statistics in the work area 600. A word candidate statistic calculation part 340 reads the partial character string statistics and the word candidates from the work area 600, and reads from an other-document statistic DB 700 the partial character string statistics calculated in advance from a document set different from the word extraction target documents. It adds the partial character string statistics of the two document sets to calculate a statistic for each word candidate, and stores the result in the work area 600. A word candidate selection part 350 reads the word candidate statistics from the work area 600, selects among the word candidates on the basis of those statistics to decide the words, and stores data on the decided words in the work area 600. COPYRIGHT: (C)2004,JPO&NCIPI
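The flow described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say which statistic is calculated or how candidates are selected, so this sketch assumes plain occurrence counts and a fixed threshold. The function names (`substring_counts`, `candidate_statistics`, `select_words`) and the parameters `max_len` and `threshold` are hypothetical and do not come from the patent.

```python
from collections import Counter


def substring_counts(documents, max_len=4):
    """Count every substring of length 1..max_len in the given documents.

    Assumption: the "statistic" of a partial character string is its raw
    occurrence count; the abstract does not name the actual statistic.
    """
    counts = Counter()
    for text in documents:
        for start in range(len(text)):
            for length in range(1, max_len + 1):
                if start + length > len(text):
                    break
                counts[text[start:start + length]] += 1
    return counts


def candidate_statistics(candidates, target_counts, other_counts):
    """Add the statistics of both document sets for each word candidate.

    other_counts stands in for statistics precomputed from a different
    document set (the role of the other-document statistic DB).
    """
    return {
        c: target_counts.get(c, 0) + other_counts.get(c, 0)
        for c in candidates
    }


def select_words(stats, threshold):
    """Keep candidates whose combined statistic reaches the threshold.

    Assumption: a simple threshold rule is used for selection.
    """
    return sorted(c for c, value in stats.items() if value >= threshold)


# Illustrative data only: one target document and made-up counts that
# stand in for statistics precomputed from another document set.
target_docs = ["東京都に住む"]
other_stats = {"東京都": 5}
candidates = ["東京都", "に住"]

target_stats = substring_counts(target_docs)
combined = candidate_statistics(candidates, target_stats, other_stats)
words = select_words(combined, threshold=3)
```

In this toy data, "東京都" occurs only once in the target document, so a threshold on the target counts alone would reject it. Adding the counts from the other document set is what lets the low-frequency candidate pass, which is the effect the abstract aims for.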
Publication number JP2004272639(A) Publication date 2004.09.30
Application number JP20030063209 Filing date 2003.03.10
Applicant NIPPON TELEGR & TELEPH CORP <NTT> Inventor ADACHI TAKAYUKI;YAMADA SETSUO;NAGATA MASAAKI
Classification G06F17/28 Main classification G06F17/28
Agency Agent
Principal claim
Address