Abstract |
PROBLEM TO BE SOLVED: To support the automatic classification of handwritten documents. SOLUTION: The document classification device includes a document input part, an extraction part, a feature value extraction/conversion part, a similarity detection part, a calculation part, and a storage part. The document input part acquires a plurality of documents, each input as stroke information. The extraction part extracts one or more of graphic information, annotation information, and text information from the stroke information. The feature value extraction/conversion part calculates, from the extracted information, feature values that enable inter-document similarity to be compared. The similarity detection part sets a plurality of clusters, each of which includes representative vectors containing feature values that show the characteristics of the cluster, and determines the cluster to which each of the plurality of documents belongs. The calculation part calculates, as a classification rule, one or more feature values characterizing the representative vectors included in each of the clusters. The storage part stores the classification rule. |
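
The flow from feature values to a stored classification rule can be illustrated with a minimal sketch. It is not the method of the device itself: the abstract does not name a similarity measure, a clustering algorithm, or a rule format. The sketch assumes cosine similarity, one mean vector per cluster as its representative vector, a k-means-style update, and a rule that simply names the feature on which a cluster's representative vector most exceeds the average across clusters. The feature names and the functions `cosine`, `assign`, `update`, `cluster`, and `derive_rules` are hypothetical.

```python
import math

# Hypothetical feature names. The abstract only says that feature values are
# derived from graphic, annotation, and text information in the strokes.
FEATURES = ["graphic_ratio", "annotation_ratio", "text_ratio"]


def cosine(a, b):
    """Cosine similarity between two feature vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def assign(doc, representatives):
    """Return the index of the cluster whose representative is most similar."""
    return max(range(len(representatives)),
               key=lambda i: cosine(doc, representatives[i]))


def update(docs, labels, old_representatives):
    """Recompute each representative vector as the mean of its members."""
    new_representatives = []
    for c, old in enumerate(old_representatives):
        members = [d for d, label in zip(docs, labels) if label == c]
        if not members:
            # Keep the previous representative for an empty cluster.
            new_representatives.append(list(old))
            continue
        new_representatives.append(
            [sum(col) / len(members) for col in zip(*members)])
    return new_representatives


def cluster(docs, representatives, iterations=10):
    """Alternate assignment and update steps (k-means-style, assumed)."""
    reps = [list(r) for r in representatives]
    labels = []
    for _ in range(iterations):
        labels = [assign(d, reps) for d in docs]
        reps = update(docs, labels, reps)
    return labels, reps


def derive_rules(reps, top_n=1):
    """For each cluster, name the features on which its representative
    vector most exceeds the mean over all representatives."""
    mean = [sum(col) / len(reps) for col in zip(*reps)]
    rules = {}
    for c, rep in enumerate(reps):
        ranked = sorted(range(len(rep)),
                        key=lambda j: rep[j] - mean[j], reverse=True)
        rules[c] = [FEATURES[j] for j in ranked[:top_n]]
    return rules


# Illustrative feature vectors, not data from the source.
docs = [
    [0.8, 0.1, 0.1],  # mostly graphics
    [0.7, 0.2, 0.1],
    [0.1, 0.1, 0.8],  # mostly text
    [0.2, 0.1, 0.7],
]
labels, reps = cluster(docs, representatives=[docs[0], docs[2]])
rules = derive_rules(reps)  # stands in for what the storage part would keep
```

Here `rules` maps each cluster index to the feature names that distinguish it, so a later document could be routed by checking only those features. A real device would likely use richer features and a persistent store rather than an in-memory dictionary.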