Abstract |
PROBLEM TO BE SOLVED: To make it easier to extract information from documents by improving the performance of natural language processing tasks, including document clustering. SOLUTION: Document clustering groups documents that describe the same topic, so the documents belonging to one cluster must share some common content. In addition, each topic has terms or term pairs that are distinctive to it. Based on these observations, when the closeness of each document to a cluster of interest is computed, the information common to that cluster is used, while the influence of terms or term pairs that are not distinctive to that cluster is excluded. COPYRIGHT: (C)2005,JPO&NCIPI
|
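The abstract does not give formulas, so the following is only a minimal sketch of the idea under stated assumptions. It treats a cluster's common information as the fraction of its documents that contain each term, and treats a term as distinctive when that fraction is at least `min_ratio` times its fraction in the whole collection. The function names, the `min_ratio` threshold, the normalization in `closeness`, and the toy documents are all hypothetical and are not taken from the patent. Term pairs are not handled here; they could be represented as tuple tokens in the same way.

```python
from collections import Counter


def doc_freq(docs):
    """Return, for each term, the fraction of documents in `docs` containing it."""
    counts = Counter(term for doc in docs for term in set(doc))
    return {term: n / len(docs) for term, n in counts.items()}


def distinctive_terms(cluster_docs, all_docs, min_ratio=1.5):
    """Keep the cluster's terms that are markedly more frequent in the cluster
    than in the whole collection, weighted by their in-cluster frequency.

    Assumes `cluster_docs` is a subset of `all_docs`, so every cluster term
    also has a collection-wide frequency.
    """
    in_cluster = doc_freq(cluster_docs)
    overall = doc_freq(all_docs)
    return {t: f for t, f in in_cluster.items() if f / overall[t] >= min_ratio}


def closeness(doc, weights):
    """Average weight of the document's distinct terms.

    Terms absent from `weights` (not distinctive to the cluster) contribute 0,
    which is how their influence is excluded in this sketch.
    """
    terms = set(doc)
    if not terms:
        return 0.0
    return sum(weights.get(t, 0.0) for t in terms) / len(terms)


# Toy collection: two finance documents and two soccer documents.
all_docs = [
    ["stock", "market", "price", "the"],
    ["stock", "price", "rise", "the"],
    ["soccer", "match", "goal", "the"],
    ["soccer", "goal", "team", "the"],
]
finance_cluster = all_docs[:2]

weights = distinctive_terms(finance_cluster, all_docs)
near = closeness(["stock", "price", "fall", "the"], weights)
far = closeness(["soccer", "team", "win", "the"], weights)
```

In this toy data the word "the" occurs in every document, so it is not distinctive to the finance cluster and is dropped from `weights`; a document therefore cannot look close to the cluster merely by sharing such collection-wide terms.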