Abstract |
PROBLEM TO BE SOLVED: To identify unnecessary words without using a threshold to decide whether or not a word is unnecessary.
SOLUTION: A parameter learning section 14 searches for and learns the appearance probability, for each topic, of each word in the word group contained in the learning document data, such that the likelihood of the learning document data is maximized. A word classification section 16 classifies each word in the word group into the topic for which its appearance probability is highest. An unnecessary word decision section 20 decides that the word group of a topic consists of unnecessary words when the words classified into that topic have appearance probabilities that fall within a predetermined range for every topic, that is, are distributed uniformly across topics.
COPYRIGHT: (C)2010, JPO&INPIT
|
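The classification and decision steps described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes the per-topic appearance probabilities have already been learned (the role of parameter learning section 14) and are supplied as a plain dictionary. The function names, the `tolerance` parameter (standing in for the "predetermined range"), the rule that a topic qualifies only when all of its words are near-uniform, and the example values are hypothetical choices made for illustration.

```python
from collections import defaultdict


def classify_words(topic_probs):
    """Assign each word to the topic with its highest appearance probability."""
    return {
        word: max(range(len(probs)), key=lambda k: probs[k])
        for word, probs in topic_probs.items()
    }


def is_near_uniform(probs, tolerance):
    """Return True if every per-topic probability lies within `tolerance` of 1/K."""
    uniform = 1.0 / len(probs)
    return all(abs(p - uniform) <= tolerance for p in probs)


def find_unnecessary_words(topic_probs, tolerance=0.05):
    """Return the word groups of topics whose words are all near-uniform.

    Assumption: a topic is treated as the "unnecessary word" topic only when
    every word classified into it has a near-uniform distribution over topics.
    """
    by_topic = defaultdict(list)
    for word, topic in classify_words(topic_probs).items():
        by_topic[topic].append(word)

    unnecessary = set()
    for words in by_topic.values():
        if all(is_near_uniform(topic_probs[w], tolerance) for w in words):
            unnecessary.update(words)
    return unnecessary


# Hypothetical learned probabilities P(topic | word) for three topics.
example_probs = {
    "the": [0.34, 0.33, 0.33],
    "of": [0.35, 0.33, 0.32],
    "engine": [0.05, 0.90, 0.05],
    "goal": [0.03, 0.02, 0.95],
}

unnecessary_words = find_unnecessary_words(example_probs, tolerance=0.05)
```

In this sketch the decision is made per topic rather than per word, mirroring the abstract: once a topic is found to collect uniformly distributed words, its whole word group is marked as unnecessary.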