Abstract |
PROBLEM TO BE SOLVED: To provide an explicit and efficient criterion for judging similarity when classifying documents by the similarity of their content. SOLUTION: The number of keywords that two documents have in common is compared against a threshold, and documents that are coupled directly or indirectly by meeting that threshold are checked for membership in the same group and classified accordingly. Classification is repeated while the common-keyword threshold is changed stepwise, so that the whole document set is classified while both the number of documents in each group and the total number of groups are kept small. COPYRIGHT: (C)2011,JPO&INPIT
|
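The grouping described in the abstract can be illustrated with a minimal Python sketch. It is not the patented procedure: the abstract does not say how the threshold is stepped, so the policy here (start low, then re-split any group larger than a size limit with a threshold one higher) is an assumption, as are the names `group_by_shared_keywords`, `classify_stepwise`, `start_threshold`, and `max_group_size`. Direct or indirect coupling is modelled as connected components, computed with a small union-find.

```python
from itertools import combinations


def group_by_shared_keywords(docs, threshold):
    """Group documents coupled directly or indirectly by >= threshold shared keywords.

    docs maps a document name to its set of keywords.
    """
    parent = {name: name for name in docs}

    def find(x):
        # Follow parent links to the group representative (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(docs, 2):
        if len(docs[a] & docs[b]) >= threshold:
            parent[find(a)] = find(b)  # direct coupling merges the two groups

    groups = {}
    for name in docs:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


def classify_stepwise(docs, start_threshold=1, max_group_size=3):
    """Assumed stepping policy: re-split oversized groups with threshold + 1."""
    pending = [(dict(docs), start_threshold)]
    final_groups = []
    while pending:
        subset, threshold = pending.pop()
        for group in group_by_shared_keywords(subset, threshold):
            if len(group) > max_group_size:
                # Too many documents: tighten the threshold for this group only.
                pending.append(({n: docs[n] for n in group}, threshold + 1))
            else:
                final_groups.append(group)
    return final_groups


if __name__ == "__main__":
    # Hypothetical keyword sets, for illustration only.
    example_docs = {
        "A": {"cat", "dog", "fish"},
        "B": {"cat", "dog", "bird"},
        "C": {"dog", "bird", "horse"},
        "D": {"horse", "cow"},
        "E": {"apple"},
    }
    for group in classify_stepwise(example_docs):
        print(sorted(group))
```

In this sketch a low threshold keeps the total number of groups small, and only groups that grow too large are split again with a stricter threshold, which is one plausible way to balance the two quantities the abstract mentions.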