Title: DETECTING AND EXECUTING DATA RE-INGESTION TO IMPROVE ACCURACY IN A NLP SYSTEM
Abstract: In some NLP systems, queries are compared to different data sources stored in a corpus to provide an answer to the query. However, the best data sources for answering the query may not currently be contained within the corpus, or the data sources in the corpus may contain stale data that provides an inaccurate answer. When receiving a query, the NLP system may evaluate the query to identify a data source that is likely to contain an answer to the query. If the data source is not currently contained within the corpus, the NLP system may ingest the data source. If the data source is already within the corpus, however, the NLP system may determine a time-sensitivity value associated with at least some portion of the query. This value may then be used to determine whether the data source should be re-ingested, e.g., because the information contained in the corpus is stale.
Publication No.: US2014280253(A1) Publication date: 2014.09.18
Application No.: US201313804876 Filing date: 2013.03.14
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION Inventors: Clark Adam T.; Dubbels Joel C.; Huebert Jeffrey K.; Petri John E.
Classification: G06F17/30 Main classification: G06F17/30
Agency: Agent:
Principal claim: 1. A method comprising: receiving a query for processing by a natural language processing (NLP) system; identifying a data source related to the query by associating one or more elements of the query to the data source; upon determining that the related data source is not in a corpus of the NLP system, ingesting the related data source into the corpus; and upon determining that the related data source is in the corpus of the NLP system: determining a time-sensitivity value associated with the query indicating a degree to which an accurate answer to the query is dependent on a staleness of the related data source, and upon determining that the time-sensitivity value satisfies a staleness threshold, re-ingesting the related data source into the corpus.
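The decision flow recited in the claim can be illustrated with a minimal Python sketch. It is not the applicant's implementation. The keyword tables, the source names (`finance_feed`, `encyclopedia`), the scoring rule, the default threshold of 0.5, and the reading of "satisfies a staleness threshold" as "greater than or equal to" are all assumptions made here for illustration; the claim does not specify how elements of the query are associated with a data source or how the time-sensitivity value is computed.

```python
from typing import Optional, Set

# Hypothetical lookup tables (assumptions for illustration only).
SOURCE_KEYWORDS = {
    "stock": "finance_feed",
    "weather": "weather_feed",
    "capital": "encyclopedia",
}
TIME_SENSITIVE_TERMS = {"current": 0.9, "today": 0.9, "latest": 0.8}


def tokens(query: str) -> list:
    """Split a query into lowercase words with surrounding punctuation removed."""
    return [w.strip("?.,!").lower() for w in query.split()]


def identify_source(query: str) -> Optional[str]:
    """Associate an element of the query with a data source, if any matches."""
    for word in tokens(query):
        if word in SOURCE_KEYWORDS:
            return SOURCE_KEYWORDS[word]
    return None


def time_sensitivity(query: str) -> float:
    """Score in [0, 1]: how much an accurate answer depends on fresh data."""
    return max((TIME_SENSITIVE_TERMS.get(w, 0.0) for w in tokens(query)), default=0.0)


def process_query(query: str, corpus: Set[str], threshold: float = 0.5) -> str:
    """Return the action taken for the query: ingest, re-ingest, or use cached data."""
    source = identify_source(query)
    if source is None:
        return "no_source"
    if source not in corpus:
        corpus.add(source)          # source is missing from the corpus: ingest it
        return "ingest"
    if time_sensitivity(query) >= threshold:
        return "re-ingest"          # source is present but the query needs fresh data
    return "use_cached"
```

Under these assumptions, a first query such as "What is the current stock price?" causes `finance_feed` to be ingested, and repeating it triggers re-ingestion because "current" scores above the threshold. A query such as "What is the capital of France?" is answered from the already ingested source on repeat, since none of its words is marked time-sensitive.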
Address: Armonk NY US