Title: Method of answering questions and scoring answers using structured knowledge mined from a corpus of data
Abstract: In a method of answering questions and scoring answers, a title and at least one topical field are identified for a document. A field name and field content associated with the topical field are identified, and a title-oriented document is created by combining the title, the field name, and the field content associated with the topical field. For each title-oriented document, a term in the title is matched to previously established categories to produce a title concept identifier. The topical field is synthesized to produce a field concept identifier and a field content concept identifier. A question is received. The question topic term and the question content identifier are used to identify at least one question-matching relation instance. The title concept identifier of each question-matching relation instance is identified as a candidate answer to the question. Each candidate answer and a corresponding answer score are output.
Publication number: US9299024(B2); Publication date: 2016.03.29
Application number: US201213710708; Filing date: 2012.12.11
Applicant: International Business Machines Corporation; Inventors: Bagchi Sugato; Ferrucci David A.; Levas Anthony T.; Mueller Erik T.
Classification: G06F17/00; G06N5/02; G06F17/30; G06N5/04; G06N99/00; Main classification: G06F17/00
Agency: Gibb & Riley, LLC; Agent: Gibb & Riley, LLC
Principal claim: 1. A computer system for scoring answers to questions in a question-answering system, comprising: a processor comprising an automated question answering (QA) system comprising: a tangible storage device operatively connected to said processor, said tangible storage device storing a corpus of data comprising natural language documents; and a user interface operatively connected to said processor, said user interface receiving a question into said automated QA system, said processor constructing title-oriented documents from said corpus of data, each title-oriented document comprising a title and a topical field, said topical field comprising a field name and field content associated with said topical field of a document in said corpus of data, said processor creating a relation instance by combining a field identifier for said topical field, a title concept identifier, and a corresponding field content concept identifier, said processor calculating a count for each said relation instance based on a number of occurrences of said title concept identifier and said field content concept identifier within a corresponding document in said corpus of data, said processor analyzing terms in said question, said analyzing identifying a question content identifier based on previously established question term categories, said processor comparing said question content identifier to said relation instance, said comparing identifying a question-matching relation instance, said processor generating an answer to said question by identifying said title concept identifier of each said question-matching relation instance as a candidate answer to said question, and said processor generating a score for said candidate answer by adding each said count within each said relation instance corresponding to said candidate answer.
Address: Armonk NY US
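
Illustrative sketch: The scoring step recited in claim 1 (relation instances carrying counts, matched against a question content identifier, with candidate scores obtained by adding counts) might be sketched in Python roughly as follows. This is not the patented implementation. The names `RelationInstance`, `build_counts`, `score_candidates`, and the sample data in `OBSERVATIONS` are hypothetical, and the sketch assumes that "comparing" means exact equality between the question content identifier and the field content concept identifier, which the claim does not specify.

```python
from collections import Counter
from typing import NamedTuple


class RelationInstance(NamedTuple):
    """Hypothetical container for the three identifiers combined in claim 1."""
    field_id: str          # field identifier for the topical field
    title_concept: str     # title concept identifier
    content_concept: str   # field content concept identifier


# Hypothetical per-document observations:
# (field identifier, title concept, field content concept, occurrences in that document)
OBSERVATIONS = [
    ("symptom", "influenza", "fever", 3),
    ("symptom", "influenza", "fever", 2),
    ("symptom", "malaria", "fever", 4),
    ("treatment", "influenza", "rest", 1),
]


def build_counts(observations):
    """Accumulate a count for each relation instance across documents."""
    counts = Counter()
    for field_id, title_concept, content_concept, occurrences in observations:
        counts[RelationInstance(field_id, title_concept, content_concept)] += occurrences
    return counts


def score_candidates(counts, question_content_id):
    """Return (candidate answer, score) pairs, highest score first.

    Assumption: a relation instance matches the question when its field
    content concept identifier equals the question content identifier.
    """
    scores = Counter()
    for relation, count in counts.items():
        if relation.content_concept == question_content_id:
            # The title concept of a matching relation instance is a candidate
            # answer; its score is the sum of the matching counts.
            scores[relation.title_concept] += count
    return scores.most_common()


if __name__ == "__main__":
    for answer, score in score_candidates(build_counts(OBSERVATIONS), "fever"):
        print(answer, score)
```

With the sample data above, a question whose content identifier is "fever" yields "influenza" with a score of 5 and "malaria" with a score of 4. Deriving concept identifiers from text (the category matching described in the abstract) is outside the scope of this sketch.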