Abstract |
PURPOSE: To shorten the processing time of character-recognition postprocessing that uses linguistic knowledge, by switching the postprocessing method according to whether the first candidate character is a KANA (Japanese syllabary) character or a KANJI (Chinese character). CONSTITUTION: KANA and KANJI are discriminated. When the first-rank candidate character resulting from character recognition is a KANA, a KANA character-string part 62 searches a KANA n-gram table 9, built in advance from learning data, and a word dictionary 10; if a matching KANA character-string exists, it is output to the maximum likelihood KANA candidate character selecting part 63. The maximum likelihood KANA candidate character selecting part 63 selects the maximum likelihood KANA character candidate using information such as the length of the matched registered KANA character-string, its appearance frequency, and its part of speech. When the first-rank candidate character is a KANJI, a KANJI-string testing part 72 segments the KANJI-string two characters at a time from the left and searches the dictionaries 11, 12, 13 and 14, giving priority in the order of KANJI two-character word, two-character prefix and suffix, one-character prefix and suffix, and KANJI one-character word.
|
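
The priority-ordered lookup described for the KANJI branch can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `segment_kanji`, the four set arguments (stand-ins for dictionaries 11 to 14, whose individual contents the abstract does not specify), the merging of prefixes and suffixes into one set per length, and the "unknown" fallback for unmatched characters are all assumptions made for illustration.

```python
from typing import List, Set, Tuple


def segment_kanji(
    text: str,
    two_char_words: Set[str],
    two_char_affixes: Set[str],
    one_char_affixes: Set[str],
    one_char_words: Set[str],
) -> List[Tuple[str, str]]:
    """Scan a KANJI string from the left, two characters at a time.

    Lookup priority (as stated in the abstract): two-character word,
    two-character prefix/suffix, one-character prefix/suffix,
    one-character word. The four sets are hypothetical toy dictionaries.
    """
    segments: List[Tuple[str, str]] = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]   # may be a single character at the end
        head = text[i]
        if len(pair) == 2 and pair in two_char_words:
            segments.append((pair, "two_char_word"))
            i += 2
        elif len(pair) == 2 and pair in two_char_affixes:
            segments.append((pair, "two_char_affix"))
            i += 2
        elif head in one_char_affixes:
            segments.append((head, "one_char_affix"))
            i += 1
        elif head in one_char_words:
            segments.append((head, "one_char_word"))
            i += 1
        else:
            # Assumption: an unmatched character is emitted as-is.
            segments.append((head, "unknown"))
            i += 1
    return segments


if __name__ == "__main__":
    # Toy data, invented for this example only.
    result = segment_kanji(
        "新東京都庁",
        two_char_words={"東京", "都庁"},
        two_char_affixes=set(),
        one_char_affixes={"新"},
        one_char_words=set(),
    )
    print(result)
```

With this toy data the string is split into the one-character prefix 新 followed by the two-character words 東京 and 都庁. Trying the longer, more specific entries first means that most two-character spans are resolved with a single lookup, which is in line with the stated purpose of shortening processing time.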