Title of invention: METHOD, DEVICE, AND PROGRAM FOR LANGUAGE MODEL GENERATION AND DEVICE AND PROGRAM FOR TEXT ANALYSIS
Abstract: PROBLEM TO BE SOLVED: To estimate symbol chain probabilities (a language model) by assigning all possible classes to each symbol. SOLUTION: For each symbol of a symbol string read from a text database 140, which holds text data on a storage medium, the corresponding classes are found by referring to a symbol-class correspondence table 150, which stores each symbol together with its one or more classes on the storage medium; a class list for the symbol is generated and stored on the storage medium. The occurrence frequency of each class chain is then counted over all combinations obtained by selecting one class from each of the N (an integer ≥ 2) class lists corresponding to N adjacent symbols in the read symbol string, and the symbol chain probability that serves as the language model is generated from the class-chain frequency information obtained by this counting. COPYRIGHT: (C)2004,JPO
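The counting step described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names `count_class_chains` and `chain_probabilities`, the dictionary form of the symbol-class table, the unit weight given to each class combination, and the relative-frequency probability estimate are all assumptions, since the abstract does not specify them.

```python
from collections import Counter
from itertools import product


def count_class_chains(symbols, symbol_classes, n=2):
    """Count class chains over all class combinations of n adjacent symbols.

    symbols        -- the symbol string, as a list of symbols
    symbol_classes -- hypothetical stand-in for the symbol-class
                      correspondence table: symbol -> list of classes
    n              -- chain length (an integer >= 2 in the abstract)
    """
    # Look up the class list of every symbol in the string.
    class_lists = [symbol_classes[s] for s in symbols]

    counts = Counter()
    # Slide a window over each run of n adjacent symbols.
    for i in range(len(class_lists) - n + 1):
        window = class_lists[i:i + n]
        # Pick one class from each of the n class lists, in every possible way.
        # Assumption: each combination contributes a count of 1.
        for chain in product(*window):
            counts[chain] += 1
    return counts


def chain_probabilities(counts):
    """Estimate P(last class | preceding classes) by relative frequency.

    Assumption: the abstract only says probabilities are generated from the
    frequency information; plain relative frequency is used here.
    """
    history_totals = Counter()
    for chain, freq in counts.items():
        history_totals[chain[:-1]] += freq
    return {
        chain: freq / history_totals[chain[:-1]]
        for chain, freq in counts.items()
    }


# Toy data: symbol "a" belongs to two classes, symbol "b" to one.
table = {"a": ["X", "Y"], "b": ["Z"]}
counts = count_class_chains(["a", "b", "a"], table, n=2)
probs = chain_probabilities(counts)
```

With this toy table, the pair ("a", "b") contributes the chains (X, Z) and (Y, Z), and the pair ("b", "a") contributes (Z, X) and (Z, Y), so an ambiguous symbol is counted under every class it may belong to rather than being forced into one.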
Publication number: JP2004069858(A)  Publication date: 2004.03.04
Application number: JP20020226575  Filing date: 2002.08.02
Applicant: NIPPON TELEGR & TELEPH CORP <NTT>  Inventors: HORI TAKAAKI; OFU KATSUTOSHI; MATSUNAGA SHOICHI
Classification: G06F17/28; G10L15/06; G10L15/18  Main classification: G06F17/28