Title: APPARATUS AND METHOD FOR CLASSIFYING DOCUMENT, AND COMPUTER PROGRAM PRODUCT
Abstract: According to an embodiment, a document classification apparatus includes an extraction unit, a clustering unit, a classification unit, and a label assignment unit. The extraction unit is configured to extract feature words from documents. The clustering unit is configured to cluster the feature words into clusters so that the difference between the number of documents that include any one of the feature words belonging to one cluster and the number of documents that include any one of the feature words belonging to another cluster is equal to or less than a predetermined reference value. The classification unit is configured to classify the documents into the clusters so that each document belongs to the cluster to which a feature word included in that document belongs. The label assignment unit is configured to assign to each cluster, as its classification label, a word representative of the feature words in that cluster.
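The four units described in the abstract can be illustrated with a minimal Python sketch. This is not the patented method: the record does not disclose how feature words are selected, how the clusters are balanced, or how a representative word is chosen. The sketch assumes document frequency as the feature-word criterion, a greedy assignment as the balancing strategy, and the most frequent word in a cluster as its label. All function names, the sample documents, and the reference value of 1 are hypothetical.

```python
from collections import Counter

def extract_feature_words(docs, top_n):
    """Pick the top_n words by document frequency (an assumed criterion)."""
    df = Counter(word for doc in docs for word in set(doc.split()))
    # Sort by descending frequency, then alphabetically for determinism.
    return sorted(df, key=lambda w: (-df[w], w))[:top_n]

def docs_covered(words, docs):
    """Indices of documents that contain at least one of the given words."""
    wanted = set(words)
    return {i for i, doc in enumerate(docs) if wanted & set(doc.split())}

def cluster_feature_words(feature_words, docs, k):
    """Greedy assumption: give each word to the cluster covering the fewest documents."""
    clusters = [[] for _ in range(k)]
    for word in feature_words:
        target = min(range(k), key=lambda c: len(docs_covered(clusters[c], docs)))
        clusters[target].append(word)
    return clusters

def is_balanced(clusters, docs, reference):
    """Check that cluster document counts differ by at most the reference value."""
    counts = [len(docs_covered(c, docs)) for c in clusters]
    return max(counts) - min(counts) <= reference

def classify(clusters, docs):
    """For each cluster, list the documents containing one of its feature words."""
    return [sorted(docs_covered(c, docs)) for c in clusters]

def assign_labels(clusters, docs):
    """Label each cluster with its most widely used feature word (an assumption)."""
    return [
        max(c, key=lambda w: len(docs_covered([w], docs))) if c else None
        for c in clusters
    ]

# Hypothetical sample data.
docs = [
    "printer paper jam",
    "printer toner low",
    "network cable loose",
    "network router reset",
    "printer network setup",
]
feature_words = extract_feature_words(docs, top_n=2)
clusters = cluster_feature_words(feature_words, docs, k=2)
balanced = is_balanced(clusters, docs, reference=1)
members = classify(clusters, docs)
labels = assign_labels(clusters, docs)
```

In this sketch a document may fall into more than one cluster when it contains feature words from several clusters, which is consistent with the wording of the abstract but is not confirmed by it.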
Publication number: US2013268535(A1)  Publication date: 2013.10.10
Application number: US201313845989  Filing date: 2013.03.18
Applicant: KABUSHIKI KAISHA TOSHIBA;TOSHIBA SOLUTIONS CORPORATION  Inventors: INABA MASUMI;MANABE TOSHIHIKO;KOKUBU TOMOHARU;NAKANO WATARU
Classification: G06F17/30  Main classification: G06F17/30
Agency:  Agent:
Main claim:
Address: