Title of invention AUTOMATIC DOCUMENT SORTING DEVICE
Abstract PURPOSE: To raise the probability that a word contained in a document to be sorted matches one of the stored effective words, by holding as many words effective for sorting (effective words) as possible, and to reduce the cost of processing in the vector space that represents documents, by keeping the number of basic words that serve as the axes of that space as small as possible. CONSTITUTION: An effective word extracting part 25 extracts as many effective words as possible from all the documents stored in a document database 24, in which plural documents are stored after being sorted into categories in advance, and registers them in an effective word dictionary 26. A basic word extracting part 27 extracts, from the effective words registered in the effective word dictionary 26, as few basic words as possible to serve as the axes of the vector space that represents documents. Each effective word registered in the effective word dictionary 26 is given information on its correlation with the basic words. Based on this correlation information, a vector expressing part 22 expresses a document input as the sorting object as a low-dimensional vector, and an identification deciding part 23 decides the category to which the document belongs by performing calculations in the vector space, such as computing the distance between documents.
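The flow described in the abstract (many effective words, few basic words, classification in the reduced space) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the word lists, the correlation weights, the category names, the function names `to_vector`, `cosine` and `classify`, and the use of cosine similarity as a stand-in for the "distance between documents" calculation.

```python
import math

# Hypothetical basic words: the few axes of the vector space.
BASIC_WORDS = ["computer", "finance"]

# Hypothetical effective-word dictionary: each of the many effective words
# carries its correlation with every basic word (weights are made up).
EFFECTIVE_WORDS = {
    "computer": [1.0, 0.0],
    "software": [0.9, 0.1],
    "printer": [0.8, 0.0],
    "finance": [0.0, 1.0],
    "stock": [0.1, 0.9],
    "bank": [0.0, 0.8],
}


def to_vector(text):
    """Express a document as a vector with one dimension per basic word.

    Words that are not in the effective-word dictionary are ignored.
    """
    vec = [0.0] * len(BASIC_WORDS)
    for word in text.lower().split():
        weights = EFFECTIVE_WORDS.get(word)
        if weights is None:
            continue
        for i, weight in enumerate(weights):
            vec[i] += weight
    return vec


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical category vectors, built here from tiny pre-sorted samples.
CATEGORY_VECTORS = {
    "technology": to_vector("computer software printer"),
    "economy": to_vector("finance stock bank"),
}


def classify(text):
    """Return the category whose vector is most similar to the document."""
    doc = to_vector(text)
    return max(CATEGORY_VECTORS, key=lambda c: cosine(doc, CATEGORY_VECTORS[c]))
```

In this sketch a document containing "printer" and "software" still lands near the "computer" axis even though neither word is itself an axis, which is the effect the abstract attributes to keeping many effective words while using only a few basic words.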
Publication number JPH08221447(A) Publication date 1996.08.30
Application number JP19950046564 Filing date 1995.02.10
Applicant CANON INC Inventors HIROTA MAKOTO;ITO SHIRO;SHIBATA SHOGO;UEDA TAKANARI;IKEDA YUJI;FUJITA MINORU
Classification G06F17/21;G06F17/22;G06F17/27;G06F17/30 Main classification G06F17/21
Agency Agent
Main claim
Address