Title of invention: Method and apparatus for text classification
Abstract: A text classification system and method that can be used by an application for classifying natural language text input into a computer system having a domain-specific knowledge base with a plurality of categories. The text classification system classifies natural language input text by first parsing it into a first list of recognized keywords. This list is then used to deduce further facts from the input text, which are compiled into a second list. Next, a numeric similarity score is calculated for each of the categories in the knowledge base, indicating how similar that category is to the input text. A dynamic threshold is then applied to determine which of the categories are most similar to the recognized keywords of the input text, and those categories are compiled into a third list. An optional rule base can be used to further refine the determination of which categories are most similar to the recognized keywords. An optional learning capability can also be added to improve the accuracy of the text classification system.
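The steps in the abstract (keyword list, deduced-fact list, per-category similarity score, dynamic threshold, result list) can be sketched as follows. This is only an illustrative sketch, not the patented method: the knowledge base contents, the inference table, the Jaccard overlap used as the similarity score, and the "fraction of the best score" threshold are all assumptions, since the abstract does not specify them. All function and variable names are hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical knowledge base: category name -> terms describing it.
KNOWLEDGE_BASE: Dict[str, Set[str]] = {
    "printer_problem": {"printer", "paper", "jam", "hardware"},
    "network_problem": {"network", "connection", "timeout", "hardware"},
    "account_problem": {"password", "login", "account"},
}

# Hypothetical inference table: recognized keyword -> facts it implies.
INFERENCES: Dict[str, Set[str]] = {
    "printer": {"hardware"},
    "network": {"hardware"},
    "password": {"account"},
}


def parse_keywords(text: str, vocabulary: Set[str]) -> List[str]:
    """First list: keywords from the text that the knowledge base recognizes."""
    keywords: List[str] = []
    for token in text.split():
        word = token.strip(".,;:!?").lower()
        if word in vocabulary and word not in keywords:
            keywords.append(word)
    return keywords


def deduce_facts(keywords: List[str]) -> List[str]:
    """Second list: the keywords plus any facts deduced from them."""
    facts = list(keywords)
    for keyword in keywords:
        for fact in sorted(INFERENCES.get(keyword, set())):
            if fact not in facts:
                facts.append(fact)
    return facts


def similarity(facts: List[str], category_terms: Set[str]) -> float:
    """Assumed score: Jaccard overlap between the facts and a category's terms."""
    fact_set = set(facts)
    union = fact_set | category_terms
    return len(fact_set & category_terms) / len(union) if union else 0.0


def classify(text: str, ratio: float = 0.8) -> List[str]:
    """Third list: categories scoring within `ratio` of the best score."""
    vocabulary: Set[str] = set().union(*KNOWLEDGE_BASE.values())
    facts = deduce_facts(parse_keywords(text, vocabulary))
    scores = {c: similarity(facts, terms) for c, terms in KNOWLEDGE_BASE.items()}
    best = max(scores.values(), default=0.0)
    if best == 0.0:
        return []  # nothing recognized, so no category qualifies
    threshold = ratio * best  # assumed "dynamic" threshold: relative to the top score
    matches = [c for c, s in scores.items() if s >= threshold]
    return sorted(matches, key=lambda c: -scores[c])


if __name__ == "__main__":
    print(classify("The printer has a paper jam."))
```

Because the threshold in this sketch moves with the best score rather than being a fixed cutoff, a weak but clearly leading match can still be returned, while near-ties are returned together. The optional rule base and learning capability mentioned in the abstract are not modeled here.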
Publication number: US5371807(A)  Publication date: 1994.12.06
Application number: US19920855378  Filing date: 1992.03.20
Applicant: DIGITAL EQUIPMENT CORPORATION  Inventors: REGISTER, MICHAEL S.; KANNAN, NARASIMHAN
Classification: G06F17/27; G06F17/30; (IPC1-7): G06K9/72; G06F15/38  Main classification: G06F17/27
Agency:  Agent:
Main claim:
Address: