Title of invention APPARATUS FOR SEGMENTING CHINESE CHARACTER SEQUENCE TO CHINESE WORD SEQUENCE
Abstract PROBLEM TO BE SOLVED: To provide an apparatus for segmenting a Chinese character sequence into an appropriate word sequence. SOLUTION: The apparatus comprises a Chinese subword list 64 and a statistical probability model 66 of sequences of IOB tags assigned to Chinese subwords. The apparatus further comprises a subword-based IOB tagging module 88 that segments a Chinese sentence 80 into a first Chinese word sequence by maximum likelihood estimation, using the subword list 64 and the probability model 66. Words in the first Chinese word sequence that consist of multiple subwords are segmented into subwords, each labeled with an IOB tag according to the segmentation. The subword-based IOB tagging module 88 treats the words in the Chinese subword list 64 as subwords when segmenting the Chinese sentence 80. COPYRIGHT: (C) 2008, JPO & INPIT
Publication number JP2008140117(A) Publication date 2008.06.19
Application number JP20060325457 Filing date 2006.12.01
Applicant NATIONAL INSTITUTE OF INFORMATION & COMMUNICATION TECHNOLOGY Inventors CHO ZUIKYO; SUMIDA EIICHIRO
Classification G06F17/27 Main classification G06F17/27
Agency Agent
Main claim
Address
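The subword-level IOB labeling described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names, the example sentence, and the one-entry subword list are hypothetical. Splitting text into subwords by forward maximum matching is an assumption, as is the tag convention used here (B for the first subword of a multi-subword word, I for the following subwords, O for a word that is a single subword). The statistical probability model 66 is not modeled; the sketch only shows how a word sequence maps to tagged subwords and back.

```python
def split_into_subwords(text, subword_list):
    """Split text into subwords by forward maximum matching (an assumption).

    At each position the longest entry of subword_list is taken;
    if none matches, a single character is used as the subword.
    """
    max_len = max((len(s) for s in subword_list), default=1)
    units, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in subword_list:
                units.append(piece)
                i += size
                break
    return units


def words_to_iob(words, subword_list):
    """Label every subword of a segmented sentence with B, I, or O."""
    tagged = []
    for word in words:
        units = split_into_subwords(word, subword_list)
        if len(units) == 1:
            tagged.append((units[0], "O"))  # word is a single subword
            continue
        for k, unit in enumerate(units):
            tagged.append((unit, "B" if k == 0 else "I"))
    return tagged


def iob_to_words(tagged):
    """Rebuild the word sequence from (subword, tag) pairs."""
    words = []
    for unit, tag in tagged:
        if tag == "I" and words:
            words[-1] += unit  # continue the current word
        else:
            words.append(unit)  # B or O starts a new word
    return words


# Hypothetical data for illustration only.
subword_list = {"北京"}
words = ["黄英春", "住", "在", "北京市"]
tagged = words_to_iob(words, subword_list)
restored = iob_to_words(tagged)
```

In this sketch the listed entry 北京 stays a single unit inside 北京市, so the word is labeled with two tags rather than three. In the apparatus, the tags would instead be predicted for an unsegmented sentence by the tagging module 88 using the probability model 66.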