Abstract |
Conventional collections of community-specific expressions are largely limited to technical terms, that is, nouns and compound nouns used in technical fields, and are difficult to apply to new expressions other than nouns. Even work on collecting unknown words and new words is substantially limited to nouns, and no technique for systematically collecting new expressions has been proposed. The invention addresses this problem by providing (a) means for extracting n-gram collocations specific to a predetermined community from a set of documents used in that community, (b) means for selecting a radical that may form the core of a specific expression, (c) means for expanding the selected radical forward and backward, and (d) means for screening the expanded radicals according to grammar.
|
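The four means (a) to (d) can be illustrated with a minimal Python sketch. It is not the claimed implementation: the abstract does not say how specificity is measured, how radicals are chosen, or which grammar is applied. The sketch assumes that specificity is a smoothed frequency ratio against a reference corpus, that radicals are the top-scoring n-grams, that expansion adds one neighbouring token at a time while the longer sequence still recurs, and that the grammar check is a simple function-word filter. All function names, thresholds, and the `FUNCTION_WORDS` list are hypothetical.

```python
from collections import Counter

# Hypothetical stand-in for a grammar: candidates may not start or end with these.
FUNCTION_WORDS = {"the", "a", "an", "of", "to", "and"}


def count_ngrams(docs, n):
    """Count every n-gram (as a tuple of tokens) over tokenized documents."""
    counts = Counter()
    for tokens in docs:
        counts.update(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return counts


def specific_ngrams(community_docs, reference_docs, n=2, min_count=2, min_ratio=1.5):
    """(a) Keep n-grams relatively more frequent in the community (assumed measure)."""
    comm = count_ngrams(community_docs, n)
    ref = count_ngrams(reference_docs, n)
    comm_total = max(sum(comm.values()), 1)
    ref_total = sum(ref.values())
    scored = {}
    for gram, c in comm.items():
        if c < min_count:
            continue
        # Add-one smoothing so n-grams unseen in the reference do not divide by zero.
        ratio = (c / comm_total) / ((ref.get(gram, 0) + 1) / (ref_total + 1))
        if ratio >= min_ratio:
            scored[gram] = ratio
    return scored


def select_radicals(scored, top_k=5):
    """(b) Pick the highest-scoring n-grams as radicals (assumed criterion)."""
    ranked = sorted(scored.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gram for gram, _ in ranked[:top_k]]


def expand(radical, docs, min_count=2, max_len=5):
    """(c) Grow a radical forward and backward while the longer form recurs."""
    found = {radical}
    frontier = [radical]
    while frontier:
        seq = frontier.pop()
        if len(seq) >= max_len:
            continue
        left, right = Counter(), Counter()
        for tokens in docs:
            for i in range(len(tokens) - len(seq) + 1):
                if tuple(tokens[i:i + len(seq)]) != seq:
                    continue
                if i > 0:
                    left[tokens[i - 1]] += 1
                if i + len(seq) < len(tokens):
                    right[tokens[i + len(seq)]] += 1
        longer = [(tok,) + seq for tok, c in left.items() if c >= min_count]
        longer += [seq + (tok,) for tok, c in right.items() if c >= min_count]
        for cand in longer:
            if cand not in found:
                found.add(cand)
                frontier.append(cand)
    return found


def grammatical(seq):
    """(d) Toy screening rule: reject sequences with a function word at either edge."""
    return seq[0] not in FUNCTION_WORDS and seq[-1] not in FUNCTION_WORDS


def collect_expressions(community_docs, reference_docs):
    """Run steps (a) to (d) and return the surviving candidates, sorted."""
    scored = specific_ngrams(community_docs, reference_docs)
    candidates = set()
    for radical in select_radicals(scored):
        candidates |= expand(radical, community_docs)
    return sorted(c for c in candidates if grammatical(c))
```

Documents are passed as lists of tokens, so any tokenizer can be used in front of the sketch. A real system would replace `grammatical` with part-of-speech or syntactic rules; the function-word filter only shows where such screening would sit in the pipeline.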