Abstract |
<P>PROBLEM TO BE SOLVED: To provide a text analysis device, method, and the like that target European languages (especially English) and Asian languages (especially Japanese, Chinese, and Korean) and can analyze a plurality of languages with a single system. <P>SOLUTION: When input text reaches a character code conversion part 1, its character code is converted from the local code of its language into Unicode. A word and phrase analysis part 2 retrieves the word and phrase analysis rules for each language and analyzes the input sentence according to those rules to create word candidates. For these word candidates, an analysis engine 5 retrieves the statistical language model for each language, looks up a dictionary using the word unigram model included in that model to create morpheme candidates, and subjects the morpheme candidates to analysis processing based on the statistical language model for that language. Finally, a character code conversion part 6 converts the character code from Unicode into the local code of language X and outputs the analyzed text in language X. <P>COPYRIGHT: (C)2004,JPO&NCIPI |
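The flow described in the SOLUTION paragraph can be illustrated with a minimal Python sketch. It is not the patented implementation: the resource table `LANGUAGE_RESOURCES`, the functions `segment` and `analyze`, the toy unigram probabilities, and the choice of Shift_JIS and Latin-1 as "local codes" are all assumptions made for illustration. The sketch also collapses morpheme candidate generation and model-based scoring into a single best-path search over word unigram log-probabilities, which is a simplification of the analysis engine described above.

```python
import math

# Hypothetical per-language resources: local encoding, a lexical rule flag,
# and a toy word unigram model. Real resources would be loaded from files.
LANGUAGE_RESOURCES = {
    "ja": {
        "local_encoding": "shift_jis",
        "space_delimited": False,
        "unigram": {"東京": 0.02, "京都": 0.01, "東": 0.001, "都": 0.004,
                    "に": 0.05, "住む": 0.01},
    },
    "en": {
        "local_encoding": "latin-1",
        "space_delimited": True,
        "unigram": {"the": 0.06, "cat": 0.001, "sat": 0.001},
    },
}


def segment(chunk, unigram, max_len=8, unknown_logp=-20.0):
    """Find the highest-scoring split of chunk under a word unigram model."""
    n = len(chunk)
    # best[i] = (best log-probability of chunk[:i], start index of last word)
    best = [(-math.inf, 0)] * (n + 1)
    best[0] = (0.0, 0)
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            word = chunk[start:end]
            if word in unigram:
                logp = math.log(unigram[word])
            elif end - start == 1:
                logp = unknown_logp  # assumed penalty for an unknown character
            else:
                continue  # multi-character strings must be in the dictionary
            score = best[start][0] + logp
            if score > best[end][0]:
                best[end] = (score, start)
    # Walk the back-pointers from the end of the chunk to recover the words.
    words = []
    pos = n
    while pos > 0:
        start = best[pos][1]
        words.append(chunk[start:pos])
        pos = start
    return words[::-1]


def analyze(raw, lang):
    """Local code -> Unicode -> word analysis -> local code."""
    res = LANGUAGE_RESOURCES[lang]
    text = raw.decode(res["local_encoding"])      # conversion into Unicode
    words = []
    for chunk in text.split():                    # language-specific lexical rule
        if res["space_delimited"]:
            words.append(chunk)
        else:
            words.extend(segment(chunk, res["unigram"]))
    return "/".join(words).encode(res["local_encoding"])  # back to local code
```

A call such as `analyze("東京都に住む".encode("shift_jis"), "ja")` returns Shift_JIS bytes in which the recovered words are separated by `/`. The same `analyze` function serves both languages; only the entry selected from the resource table changes, which is the point of routing every language through Unicode.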