Abstract |
PROBLEM TO BE SOLVED: To provide a document classification device and method that prevent documents from being classified in ways the user does not intend and that can generate initial classification representative feature vectors. SOLUTION: The document classification device has a document input part 101; a document analysis part 102, which analyzes the words of document data; a document feature vector generation part 103, which calculates a document feature vector for each document; a classification representative vector generation part 104, which generates classification representative vectors having the same number of dimensions as the document feature vectors; a refinement-excluded vector specification part 105, which specifies the classification representative vectors that are not to be refined; a document data allocation part 106, which allocates each piece of document data to one of the classification representative vectors; a classification refinement part 107, which recalculates each classification representative vector from the document feature vectors of the document data allocated to it by the document data allocation part; and a classification result storage part 108.
|
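The flow through parts 102 to 107 can be illustrated with a minimal sketch. The abstract does not specify the feature weighting, the similarity measure, or the update rule, so the code below assumes plain word counts, cosine similarity, and a mean-of-members update (as in k-means); all function names and the sample data are hypothetical and are not taken from the patent.

```python
import math
from collections import Counter


def feature_vector(text, vocabulary):
    """Parts 102-103 (assumed form): word counts over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocabulary]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def allocate(doc_vectors, representatives):
    """Part 106 (assumed form): assign each document to its most similar representative."""
    return [
        max(range(len(representatives)), key=lambda k: cosine(v, representatives[k]))
        for v in doc_vectors
    ]


def refine(doc_vectors, representatives, excluded, iterations=10):
    """Part 107 (assumed form): recompute representatives, skipping the excluded ones."""
    reps = [list(r) for r in representatives]
    assignment = allocate(doc_vectors, reps)
    for _ in range(iterations):
        for k in range(len(reps)):
            if k in excluded:
                continue  # part 105: this representative is never refined
            members = [v for v, a in zip(doc_vectors, assignment) if a == k]
            if members:
                reps[k] = [sum(col) / len(members) for col in zip(*members)]
        new_assignment = allocate(doc_vectors, reps)
        if new_assignment == assignment:
            break  # allocation is stable, so further refinement changes nothing
        assignment = new_assignment
    return reps, assignment


# Hypothetical sample data: four short documents and two initial representatives.
vocabulary = ["cat", "dog", "stock", "market"]
docs = ["cat dog cat", "dog dog cat", "stock market stock", "market stock market"]
vectors = [feature_vector(d, vocabulary) for d in docs]
initial = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]

# Representative 0 is marked as refinement-excluded and keeps its initial value.
reps, assignment = refine(vectors, initial, excluded={0})
print(assignment)
print(reps)
```

Keeping the excluded representatives fixed is what lets a user pin a category they have defined themselves, while the remaining representatives still adapt to the allocated documents.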