Title of invention: AUTOMATIC DOCUMENT CLASSIFICATION DEVICE, LEARNING DEVICE, CLASSIFICATION DEVICE, AUTOMATIC DOCUMENT CLASSIFICATION METHOD, LEARNING METHOD, CLASSIFICATION METHOD AND STORAGE MEDIUM
Abstract: PROBLEM TO BE SOLVED: To provide an automatic document classification device that can appropriately classify a document in which topics other than its main subject appear. SOLUTION: The automatic document classification device refers to a valid word dictionary and obtains paragraph vectors for each learning document and each document to be classified (paragraph vector calculation part 105). Other-topic paragraphs are identified from the distribution of the paragraph vectors (other topic paragraph decision part 107), and the valid paragraph vectors are extracted from the paragraph vectors by referring to the identified other-topic paragraphs. A document vector is obtained from the valid paragraph vectors (document vector calculation part 109). In the learning phase, the folder vector of each category is obtained from the document vectors of the learning documents (folder vector calculation part 111). In the classification phase, the category to which the document to be classified belongs is decided according to the result of comparing its document vector with the folder vectors of the respective categories (classification decision part 113).
Publication number: JPH1185796(A) Publication date: 1999.03.30
Application number: JP19970250125 Filing date: 1997.09.01
Applicant: CANON INC Inventors: OTANI NORIKO;ITO SHIRO;SHIBATA SHOGO;UEDA TAKANARI;IKEDA YUJI
Classification: G06F17/21;G06F17/30 Main classification: G06F17/21
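The processing flow summarized in the abstract (paragraph vectors, other-topic paragraph decision, document vector, folder vectors, classification decision) can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so everything concrete here is an assumption made for illustration: paragraph vectors are raw counts of valid-dictionary words, a paragraph counts as "other topic" when its cosine similarity to the sum of all paragraph vectors falls below a hypothetical `threshold`, a folder vector is the sum of a category's document vectors, and the comparison is cosine similarity. All function names and the sample data are hypothetical and are not taken from the patent.

```python
import math
from collections import Counter


def paragraph_vector(paragraph, valid_words):
    """Count how often each valid-dictionary word occurs in one paragraph.

    Assumption: whitespace tokenization and lowercasing stand in for
    whatever word extraction the device actually performs.
    """
    counts = Counter(w for w in paragraph.lower().split() if w in valid_words)
    return [counts[w] for w in valid_words]


def cosine(a, b):
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def vector_sum(vectors, dim):
    """Element-wise sum of equal-length vectors."""
    total = [0.0] * dim
    for v in vectors:
        for i, x in enumerate(v):
            total[i] += x
    return total


def document_vector(paragraphs, valid_words, threshold=0.5):
    """Sum the paragraph vectors that are not judged to be other-topic.

    Assumption: a paragraph is "other topic" when its similarity to the
    sum of all paragraph vectors is below `threshold`.
    """
    dim = len(valid_words)
    vecs = [paragraph_vector(p, valid_words) for p in paragraphs]
    centroid = vector_sum(vecs, dim)
    valid = [v for v in vecs if cosine(v, centroid) >= threshold]
    # Fall back to all paragraphs if every one was filtered out.
    return vector_sum(valid or vecs, dim)


def folder_vectors(training, valid_words):
    """Learning phase: one folder vector per category.

    `training` maps a category name to a list of documents, each document
    being a list of paragraph strings. Summing instead of averaging is
    harmless here because cosine similarity ignores vector length.
    """
    dim = len(valid_words)
    return {
        category: vector_sum(
            [document_vector(doc, valid_words) for doc in docs], dim
        )
        for category, docs in training.items()
    }


def classify(paragraphs, folders, valid_words):
    """Classification phase: pick the category with the most similar folder vector."""
    doc_vec = document_vector(paragraphs, valid_words)
    return max(folders, key=lambda category: cosine(doc_vec, folders[category]))


# Hypothetical toy data: two categories and a document with one off-topic paragraph.
valid_words = ["printer", "ink", "toner", "camera", "lens", "shutter", "lunch"]
training = {
    "printers": [["printer ink toner printer", "toner ink printer"]],
    "cameras": [["camera lens shutter", "lens camera camera"]],
}
folders = folder_vectors(training, valid_words)
doc = ["camera lens shutter camera", "lens shutter camera", "lunch lunch"]
print(classify(doc, folders, valid_words))
```

In this toy run the "lunch lunch" paragraph is dropped before the document vector is formed, so the off-topic word does not contribute to the comparison with the folder vectors. The threshold value of 0.5 is arbitrary and would need tuning on real data.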