Title of invention: METHOD AND DEVICE FOR CLASSIFYING DOCUMENT AND RECORDING MEDIUM WITH PROGRAM RECORDED THEREON
Abstract: PROBLEM TO BE SOLVED: To provide a document classifying method and a document classifying device that assign a sentence structure correctly annotated with semantic attributes and can classify a document in a way that takes the author's intent into account, even for sentences that have not been proofread or are of low quality. SOLUTION: When an input means 11 receives a document, a morpheme analyzing means 12 divides it into morphemes. A corpus learning means 13 learns the characteristics of each semantic attribute from the frequency of morphological information in the contexts where that attribute appears, using stored documents held in a corpus 18 to which morphological information and semantic attributes have been attached in advance. A semantics attaching means 14 attaches to the input document the semantic attribute whose learned characteristics are most similar in terms of morphological-information frequency. A similarity calculating means 15 computes the similarity between the input document and each stored document in the corpus 18, taking the semantic attributes into account. A classifying means 16 classifies the input document into the categories of the stored documents in the corpus 18 that have high similarity.
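The flow through means 12 to 16 can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: whitespace splitting stands in for morpheme analysis, a semantic attribute's "characteristic" is modelled as counts of neighbouring tokens, similarity is cosine similarity, the input is assigned the category of the single most similar stored document, and the two-document corpus, the attribute names, and all function names are invented.

```python
from collections import Counter, defaultdict
from math import sqrt

def analyze(text):
    """Stand-in for morpheme analysis (means 12): lowercase whitespace split."""
    return text.lower().split()

def cosine(a, b):
    """Cosine similarity between two frequency tables."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def context(tokens, i, window=2):
    """Frequency of the tokens within `window` positions of index i."""
    lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
    return Counter(tokens[lo:i] + tokens[i + 1:hi])

def learn_profiles(corpus):
    """Means 13 (assumed form): per attribute, sum the contexts it appears in."""
    profiles = defaultdict(Counter)
    for tokens, attrs, _category in corpus:
        for i, attr in enumerate(attrs):
            if attr is not None:
                profiles[attr].update(context(tokens, i))
    return profiles

def attach_attributes(tokens, profiles, threshold=0.3):
    """Means 14 (assumed form): tag each token with the best-matching attribute."""
    attrs = []
    for i in range(len(tokens)):
        ctx = context(tokens, i)
        best, score = None, threshold
        for attr, profile in profiles.items():
            s = cosine(ctx, profile)
            if s > score:
                best, score = attr, s
        attrs.append(best)
    return attrs

def features(tokens, attrs):
    """Token counts plus attribute counts, so similarity reflects both."""
    feats = Counter(tokens)
    feats.update(f"ATTR:{a}" for a in attrs if a is not None)
    return feats

def classify(text, corpus):
    """Means 15 and 16 (assumed form): category of the most similar stored document."""
    profiles = learn_profiles(corpus)
    tokens = analyze(text)
    query = features(tokens, attach_attributes(tokens, profiles))
    best = max(corpus, key=lambda doc: cosine(query, features(doc[0], doc[1])))
    return best[2]

# Invented stand-in for corpus 18: (tokens, per-token attributes, category).
CORPUS = [
    (analyze("please refund my payment"), [None, "REQUEST", None, None], "billing"),
    (analyze("the screen is broken"), [None, None, None, "FAULT"], "repair"),
]

print(classify("please refund the payment", CORPUS))
print(classify("my screen is broken", CORPUS))
```

With such a small corpus the context profiles are crude, so attributes can be attached to more tokens than a human annotator would choose; the sketch is meant only to show how attached attributes can feed into the similarity used for classification.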
Publication number: JP2000339310(A) Publication date: 2000.12.08
Application number: JP19990145115 Filing date: 1999.05.25
Applicant: NIPPON TELEGR & TELEPH CORP <NTT> Inventor: HASEGAWA TAKAAKI
Classification: G06F17/27;G06F17/28;G06F17/30 Main classification: G06F17/27
Agency: Agent:
Main claim:
Address: