Abstract |
PROBLEM TO BE SOLVED: To address the problem that, when operating a document classification system that classifies an input document into predetermined document classes by collating it with a class model, the contents of input documents change over time, the class model may become obsolete, and updating the class model then requires a large amount of work. SOLUTION: While the document classification system is in operation, the degree of resemblance between the actual document set classified into each class and the training document set of that class is obtained for all classes, and a class having a low degree of resemblance is selected. Alternatively, the degree of resemblance between the training document set of each class and the actual document sets of all other classes is obtained, and a class pair having a low degree of resemblance is selected. In this way, a class that has become obsolete is detected. In addition, the degree of resemblance between training document sets is obtained for all class pairs, and by selecting a class pair having a high degree of resemblance, a class pair with closely related topics is detected. COPYRIGHT: (C)2005,JPO&NCIPI
|
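The per-class comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the degree of resemblance is computed, so the cosine similarity over summed term counts used here is an assumption, as are the function names (`set_vector`, `cosine`, `obsolete_classes`, `close_topic_pairs`), the whitespace tokenization, the thresholds, and the sample data. The sketch also assumes that closely related topics correspond to a high resemblance between training document sets.

```python
from collections import Counter
from math import sqrt


def set_vector(docs):
    """Sum whitespace-tokenized term counts over a whole document set."""
    total = Counter()
    for doc in docs:
        total.update(doc.lower().split())
    return total


def cosine(a, b):
    """Cosine similarity between two term-count vectors (assumed measure)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def obsolete_classes(training, actual, threshold):
    """Classes whose actual documents resemble their training documents
    less than `threshold`. A class with no actual documents scores 0.0."""
    flagged = []
    for label, train_docs in training.items():
        sim = cosine(set_vector(train_docs), set_vector(actual.get(label, [])))
        if sim < threshold:
            flagged.append(label)
    return flagged


def close_topic_pairs(training, threshold):
    """Class pairs whose training sets resemble each other more than
    `threshold` (assumption: close topics mean high resemblance)."""
    labels = sorted(training)
    vectors = {label: set_vector(training[label]) for label in labels}
    return [
        (a, b)
        for i, a in enumerate(labels)
        for b in labels[i + 1:]
        if cosine(vectors[a], vectors[b]) > threshold
    ]


# Hypothetical sample data: the "tech" class has drifted toward finance.
training = {
    "sports": ["football match score", "tennis match final"],
    "tech": ["new phone release", "laptop chip release"],
}
actual = {
    "sports": ["football final score", "tennis match"],
    "tech": ["stock market crash", "bank interest rates"],
}

drifted = obsolete_classes(training, actual, threshold=0.3)
```

With this sample data, `drifted` contains only `"tech"`, because the documents now classified into that class share no terms with its training set. In practice the threshold would need to be tuned, and a weighted measure such as TF-IDF might be a better fit than raw counts.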