Title of invention |
METHOD AND DEVICE FOR CLASSIFYING DOCUMENT AND RECORDING MEDIUM WITH PROGRAM RECORDED THEREON |
Abstract |
PROBLEM TO BE SOLVED: To provide a document classifying method and a document classifying device that attach semantic attributes correctly to sentence structure and can classify a document in a way that reflects the intention of its author, even for text that has not been proofread or is of low quality. SOLUTION: When an inputting means 11 inputs a document, a morpheme analyzing means 12 divides it into morphemes. A corpus learning means 13 learns the characteristics of each semantic attribute from the frequency of morphological information in the contexts where that attribute appears, using stored documents held in a corpus 18 to which morphological information and semantic attributes have been attached in advance. A semantics attaching means 14 attaches to the input document the semantic attribute whose learned frequency characteristics are most similar to the morphological information of the input. A similarity calculating means 15 calculates the similarity between the input document and each stored document in the corpus 18, taking the semantic attributes into account. A classifying means 16 classifies the input document into the category of the stored documents in the corpus 18 that have high similarity. |
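The flow described in the abstract (morpheme analysis, corpus learning, semantic attribute attachment, similarity calculation, classification) can be illustrated with a minimal sketch. It is not the patented implementation. Everything here is an assumption made for illustration: whitespace splitting stands in for morpheme analysis, a fixed context window and cosine similarity stand in for the unspecified frequency comparison, and the names `SAMPLE_CORPUS`, `learn`, `attach`, `features`, and `classify` are hypothetical.

```python
from collections import Counter
from math import sqrt

# Hypothetical stored documents: tokens, one semantic attribute (or None)
# per token, and a category. The data is invented for illustration only.
SAMPLE_CORPUS = [
    {"tokens": "please cancel my order today".split(),
     "attrs": [None, "REQUEST", None, "OBJECT", None],
     "category": "cancellation"},
    {"tokens": "the screen shows an error".split(),
     "attrs": [None, "OBJECT", "STATE", None, "PROBLEM"],
     "category": "fault"},
]

def cosine(a, b):
    """Cosine similarity between two frequency vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def context(tokens, i, window=2):
    """Frequency of the tokens surrounding position i (assumed window)."""
    lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
    return Counter(tokens[lo:i] + tokens[i + 1:hi])

def learn(corpus, window=2):
    """Corpus learning: one context-frequency profile per semantic attribute."""
    profiles = {}
    for doc in corpus:
        for i, attr in enumerate(doc["attrs"]):
            if attr is not None:
                ctx = context(doc["tokens"], i, window)
                profiles.setdefault(attr, Counter()).update(ctx)
    return profiles

def attach(tokens, profiles, window=2):
    """Attach to each token the attribute with the most similar profile."""
    attrs = []
    for i in range(len(tokens)):
        ctx = context(tokens, i, window)
        scores = {a: cosine(ctx, p) for a, p in profiles.items()}
        best = max(scores, key=scores.get, default=None)
        attrs.append(best if best is not None and scores[best] > 0 else None)
    return attrs

def features(tokens, attrs):
    """Combine token counts and attribute counts into one vector."""
    return Counter(tokens) + Counter("ATTR:" + a for a in attrs if a)

def classify(text, corpus):
    """Return the category of the most similar stored document."""
    tokens = text.lower().split()  # stand-in for morpheme analysis
    vec = features(tokens, attach(tokens, learn(corpus)))
    best = max(corpus,
               key=lambda d: cosine(vec, features(d["tokens"], d["attrs"])))
    return best["category"]
```

With the sample data above, `classify("cancel my order now", SAMPLE_CORPUS)` selects the stored document about cancelling an order, because both the shared tokens and the attached attributes point to it. A real system would replace the whitespace split with a morphological analyzer and the toy corpus with annotated documents.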
Publication number |
JP2000339310(A) |
Publication date |
2000.12.08 |
Application number |
JP19990145115 |
Application date |
1999.05.25 |
Applicant |
NIPPON TELEGR & TELEPH CORP <NTT> |
Inventor |
HASEGAWA TAKAAKI |
Classification number |
G06F17/27;G06F17/28;G06F17/30 |
Main classification number |
G06F17/27 |
Agency |
|
Agent |
|
Principal claim |
|
Address |
|