Abstract |
PROBLEM TO BE SOLVED: To obtain a word classification result having a well-balanced hierarchical structure by classifying a plurality of words into a plurality of classes in the form of a binary tree with hierarchized lower, intermediate and upper layers. SOLUTION: A word classification processing part 20 classifies the words included in the text data stored in a text data memory 10, assigning words of comparatively low appearance frequency, and words with a high rate of being adjacent to the same word, to the same class. The part 20 then divides the word classification result into intermediate, upper and lower layers. The words are then classified in the order of the intermediate, upper and lower layers on the basis of a prescribed average mutual information content, i.e., a global cost function defined over all words included in the text data. The classified words are stored in a word dictionary memory 11 in the form of a word dictionary. This word classification processing yields a word classification result that has a well-balanced hierarchical structure and is also globally optimized. |
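
The average mutual information used above as the global cost function can be illustrated with a minimal Python sketch. It assumes the usual definition over adjacent class pairs, AMI = sum over (c1, c2) of p(c1, c2) * log2(p(c1, c2) / (p(c1) * p(c2))), and a simple greedy choice of which two classes to merge. The function names `average_mutual_information` and `best_merge`, the toy sentence and the integer class labels are hypothetical, and the sketch does not reproduce the intermediate, upper and lower layer ordering described in the abstract.

```python
import math
from collections import Counter


def average_mutual_information(tokens, word_to_class):
    """AMI (in bits) between the classes of adjacent words in `tokens`."""
    classes = [word_to_class[w] for w in tokens]
    pairs = Counter(zip(classes, classes[1:]))  # adjacent class-pair counts
    total = sum(pairs.values())
    if total == 0:
        return 0.0
    left = Counter()   # how often a class appears as the first of a pair
    right = Counter()  # how often a class appears as the second of a pair
    for (c1, c2), n in pairs.items():
        left[c1] += n
        right[c2] += n
    ami = 0.0
    for (c1, c2), n in pairs.items():
        # p(c1,c2) / (p(c1) * p(c2)) simplifies to n * total / (left * right)
        ami += (n / total) * math.log2(n * total / (left[c1] * right[c2]))
    return ami


def best_merge(tokens, word_to_class):
    """Try every pair of classes and return (ami, a, b) for the merge of
    class b into class a that keeps the AMI highest (assumed greedy step)."""
    classes = sorted(set(word_to_class.values()))
    best = None
    for i, a in enumerate(classes):
        for b in classes[i + 1:]:
            merged = {w: (a if c == b else c) for w, c in word_to_class.items()}
            score = average_mutual_information(tokens, merged)
            if best is None or score > best[0]:
                best = (score, a, b)
    return best


# Toy data: "cat" and "dog" occur between the same neighbours.
tokens = "the cat sat the dog sat".split()
word_to_class = {"the": 0, "cat": 1, "dog": 2, "sat": 3}
score, a, b = best_merge(tokens, word_to_class)
print(a, b, round(score, 3))
```

In this toy text, merging the classes of "cat" and "dog" loses the least mutual information because both words share the same left and right neighbours, which is the intuition behind grouping words that tend to be adjacent to the same word.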