Abstract |
PROBLEM TO BE SOLVED: To prevent classification precision from degrading as the character of the document group to be classified changes, by having a document classifying device, which classifies each computerized document into one of a set of previously defined categories, always maintain a classification system that matches the characteristics of that document group. SOLUTION: The device is equipped with a feature vector space estimation part 202, which estimates the feature vector space of each category (including an unclassifiable category) into which documents are classified; a classification category estimation part 203, which estimates the classification destination of each document by comparing the document's feature vector with the feature vector spaces of the categories; a category system management part 204, which decides whether the frequency with which the unclassifiable category is estimated as the classification destination exceeds a certain threshold; and a user interface part 205, which recommends that the operator add a new category when the threshold is exceeded. Categories can also be divided, deleted, or merged.
|
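The flow described in the abstract (estimate a destination category per document, count how often the unclassifiable category is chosen, and recommend a new category once a threshold is exceeded) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent: each category's feature vector space is reduced to a single centroid vector, comparison uses cosine similarity, the unclassifiable category is modeled as a similarity floor rather than as its own feature space, and the names `estimate_category`, `should_recommend_new_category`, `min_similarity`, and `max_unclassifiable_ratio` are hypothetical.

```python
import math
from collections import Counter

UNCLASSIFIABLE = "unclassifiable"  # hypothetical label for the fallback category


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def estimate_category(vec, centroids, min_similarity=0.5):
    """Return the best-matching category name, or UNCLASSIFIABLE.

    Assumption: a category is represented by one centroid, and a document
    whose best similarity falls below min_similarity is unclassifiable.
    """
    best, best_sim = UNCLASSIFIABLE, min_similarity
    for name, centroid in centroids.items():
        sim = cosine(vec, centroid)
        if sim >= best_sim:
            best, best_sim = name, sim
    return best


def should_recommend_new_category(labels, max_unclassifiable_ratio=0.3):
    """True when the share of unclassifiable results exceeds the threshold."""
    if not labels:
        return False
    ratio = Counter(labels)[UNCLASSIFIABLE] / len(labels)
    return ratio > max_unclassifiable_ratio


if __name__ == "__main__":
    # Toy three-dimensional feature vectors; real vectors would come from the documents.
    centroids = {"sports": [1.0, 0.0, 0.0], "finance": [0.0, 1.0, 0.0]}
    docs = [[0.9, 0.1, 0.0], [0.0, 0.1, 0.9], [0.1, 0.0, 1.0]]
    labels = [estimate_category(d, centroids) for d in docs]
    print(labels)
    print(should_recommend_new_category(labels))
```

In this toy run, two of the three documents point mostly along a dimension that neither centroid covers, so they fall below the similarity floor and the recommendation check fires. A fuller implementation would also need the division, deletion, and merging of categories that the abstract mentions, which this sketch does not attempt.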