Title SYSTEM AND METHOD FOR AN EXPERT QUESTION ANSWER SYSTEM FROM A DYNAMIC CORPUS
Abstract Various embodiments provide systems, computer program products and computer-implemented methods. Some embodiments include a method of updating an expert corpus set, including: obtaining a query from a user; obtaining a raw data source; determining a relevance score for the raw data source with respect to the query by performing actions including creating a first vector of statistical variables for the query using at least one natural language processing (NLP) socket, the statistical variables having category types, creating a second vector for the raw data source having category types that are the same as those for the query, generating a hypothesis regarding the relevance of the raw data source, testing the hypothesis by comparing relative statistical variables, and calculating a gradient between the vectors to determine the relevance score; and updating the expert corpus set with the raw data source in response to the relevance score exceeding a threshold.
Publication No. US2015193682(A1) Publication Date 2015.07.09
Application No. US201414148261 Filing Date 2014.01.06
Applicant International Business Machines Corporation Inventors Baughman Aaron K.; Capps, JR. Louis B.; Graham Barry M.; Mahle Jennifer R.
Classification G06N5/00; G06N99/00; G06F17/30 Main Classification G06N5/00
Agency Agent
Main Claim 1. A method of updating an expert corpus set, the method comprising: obtaining a query from a first user; obtaining a first raw data source; determining a first relevance score for the first raw data source with respect to the query, by performing actions including: creating a first vector of statistical variables for the query using at least one natural language processing (NLP) socket, the statistical variables of the first vector having category types; creating a second vector of statistical variables for the first raw data source, the statistical variables for the first raw data source having category types that are the same as the category types of the statistical variables for the query; and generating a hypothesis regarding the relevance of the first raw data source with respect to the query; testing the hypothesis by comparing each statistical variable for the query to each same statistical variable for the first raw data source; calculating a gradient between the first vector and the second vector to determine the first relevance score; and updating the expert corpus set by ingesting the first raw data source into the expert corpus in response to determining the first relevance score exceeds a first threshold.
Address Armonk NY US
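
The steps recited in the main claim can be illustrated with a minimal Python sketch. This is not the patented implementation: the record does not say which statistical variables are used, how the hypothesis is tested, or how the gradient is computed, so every such choice below is an assumption made for illustration. In particular, the regex tokenizer stands in for the "NLP socket", the category types are simply the distinct query terms, the "gradient" is taken to be the component-wise difference between the two unit-length vectors, and the function names and the 0.5 threshold are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Stand-in for an NLP component (assumption): lowercase word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def statistical_vector(text, categories):
    """Unit-length vector of term counts, one entry per category type."""
    counts = Counter(tokenize(text))
    raw = [float(counts[c]) for c in categories]
    norm = math.sqrt(sum(v * v for v in raw))
    return [v / norm for v in raw] if norm > 0 else raw


def relevance_score(query, source_text):
    """Score in [0, 1]; higher means the source looks more relevant."""
    # Category types come from the query, so both vectors share them.
    categories = sorted(set(tokenize(query)))
    if not categories:
        return 0.0
    q_vec = statistical_vector(query, categories)
    s_vec = statistical_vector(source_text, categories)

    # Hypothesis (assumed form): "the source is relevant to the query".
    # It is tested by comparing each query variable with the same
    # variable of the source; no shared nonzero category rejects it.
    supported = [q > 0 and s > 0 for q, s in zip(q_vec, s_vec)]
    if not any(supported):
        return 0.0

    # "Gradient" (assumed form): component-wise difference of the vectors.
    gradient = [s - q for q, s in zip(q_vec, s_vec)]
    distance = math.sqrt(sum(g * g for g in gradient))
    # Two non-negative unit vectors are at most sqrt(2) apart.
    return 1.0 - distance / math.sqrt(2.0)


def update_corpus(corpus, query, source_text, threshold=0.5):
    """Ingest the source only if its score exceeds the threshold."""
    score = relevance_score(query, source_text)
    if score > threshold:
        corpus.append(source_text)
    return score
```

Calling `update_corpus(corpus, "tennis serve speed", text)` on a text that mentions all three query terms appends it to `corpus`, while a text sharing no terms with the query scores 0.0 and is left out. A real system would presumably use richer statistical variables than term counts; the sketch only shows how the vector, hypothesis, gradient, and threshold steps fit together.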