Title of invention: USING NATURAL LANGUAGE PROCESSING (NLP) TO CREATE SUBJECT MATTER SYNONYMS FROM DEFINITIONS
Abstract: Methods, apparatus and systems, including computer program products, for creating subject matter synonyms from definitions extracted from a subject matter glossary. Confidence scores, each representing a likelihood that two terms defined in the subject matter glossary are synonyms, are determined by applying natural language processing (e.g., passage term matching, lexical matching, and syntactic matching) to the extracted definitions. A subject matter thesaurus is built based on the confidence scores. In one embodiment, a statement containing a first term is created based on an extracted definition of the first term, a modified statement is created by substituting a second term in the statement in lieu of the first term, a corpus is searched, and a confidence score is determined based on evidence in the corpus that the modified statement is accurate. The first and second terms are marked as synonyms if the confidence score is greater than a threshold.
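The substitution embodiment described in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration, not the filing's implementation: the function names (`make_statement`, `substitute`, `search`, `confidence`, `are_synonyms`), the "term is definition" statement template, the token-overlap evidence score, and the default threshold of 0.8 are all hypothetical stand-ins for whatever statement generation and evidence scoring a real system would use.

```python
import re


def make_statement(term, definition):
    # Assumed template: turn a glossary entry into "<term> is <definition>."
    return f"{term} is {definition.rstrip('.')}."


def substitute(statement, first_term, second_term):
    # Replace whole-word occurrences of first_term with second_term.
    pattern = r"\b" + re.escape(first_term) + r"\b"
    return re.sub(pattern, lambda m: second_term, statement, flags=re.IGNORECASE)


def tokens(text):
    # Lowercased alphanumeric tokens, as a set.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search(corpus, term):
    # Toy corpus search: keep passages that mention the term as a whole word.
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    return [passage for passage in corpus if pattern.search(passage)]


def confidence(modified_statement, second_term, corpus):
    # Toy evidence score: among passages mentioning second_term, the best
    # fraction of the modified statement's tokens found in a single passage.
    wanted = tokens(modified_statement)
    if not wanted:
        return 0.0
    hits = search(corpus, second_term)
    return max((len(wanted & tokens(p)) / len(wanted) for p in hits), default=0.0)


def are_synonyms(first_term, definition, second_term, corpus, threshold=0.8):
    statement = make_statement(first_term, definition)
    modified = substitute(statement, first_term, second_term)
    score = confidence(modified, second_term, corpus)
    # Terms are marked as synonyms only if the score exceeds the threshold.
    return score > threshold, score


corpus = ["A heart attack is the death of heart muscle due to loss of blood supply."]
definition = "the death of heart muscle due to loss of blood supply"
print(are_synonyms("myocardial infarction", definition, "heart attack", corpus))
print(are_synonyms("myocardial infarction", definition, "stroke", corpus))
```

In this sketch a candidate term that never appears in the corpus receives a score of 0.0, so it cannot be marked as a synonym regardless of how closely the definition text matches.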
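Application publication number: US2015081276(A1) Publication date: 2015.03.19
Application number: US201314026264 Filing date: 2013.09.13
Applicant: International Business Machines Corporation Inventors: Gerard Scott N.; Megerian Mark G.
Classification: G06F17/28; G06F17/27 Main classification: G06F17/28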
Agency: Agent:
Main claim: 1. A computer-implemented method for creating subject matter synonyms from definitions of terms defined in a subject matter glossary, comprising: extracting from a subject matter glossary definitions of terms defined in the subject matter glossary; determining a plurality of confidence scores by applying natural language processing to the definitions extracted from the subject matter glossary, wherein each confidence score represents a likelihood that two terms defined in the subject matter glossary are synonyms; and building a subject matter thesaurus based on the confidence scores.
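The three claimed steps (extract definitions, score term pairs, build a thesaurus) can be sketched as a small Python pipeline. This is only an illustrative sketch under stated assumptions: the one-entry-per-line "term: definition" glossary format, the use of Jaccard token overlap as a simple stand-in for lexical matching, the 0.5 threshold, and the function names are all hypothetical and do not come from the filing.

```python
import re
from itertools import combinations


def extract_definitions(glossary_text):
    # Assumed glossary format: one "term: definition" entry per line.
    entries = {}
    for line in glossary_text.splitlines():
        if ":" in line:
            term, definition = line.split(":", 1)
            entries[term.strip()] = definition.strip()
    return entries


def tokenize(text):
    # Lowercased alphanumeric tokens, as a set.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def lexical_confidence(def_a, def_b):
    # Jaccard overlap of definition tokens, used here as a toy confidence score.
    a, b = tokenize(def_a), tokenize(def_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_thesaurus(entries, threshold=0.5):
    # Link every pair of terms whose confidence score exceeds the threshold.
    thesaurus = {term: set() for term in entries}
    for a, b in combinations(entries, 2):
        if lexical_confidence(entries[a], entries[b]) > threshold:
            thesaurus[a].add(b)
            thesaurus[b].add(a)
    return thesaurus


glossary = (
    "myocardial infarction: death of heart muscle caused by loss of blood supply\n"
    "heart attack: death of heart muscle from loss of blood supply\n"
    "fracture: a break in a bone"
)
print(build_thesaurus(extract_definitions(glossary)))
```

A fuller system would presumably combine several scorers (the abstract mentions passage term matching, lexical matching, and syntactic matching); the single overlap measure above is chosen only to keep the example short.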
Address: Armonk, NY, US