Title Method for in-loop human validation of disambiguated features
Abstract Methods for providing in-loop validation of disambiguated features are disclosed. The disclosed methods may include disambiguating features in unstructured text that may use co-occurring features derived from both the source document and a large document corpus. The disambiguating systems may include multiple modules, including a linking on-the-fly module for linking the derived features from the source document to the co-occurring features of an existing knowledge base. The system for disambiguating features may allow identifying unique entities from a knowledge base that includes entities with a unique set of co-occurring features, which in turn may allow for increased precision in knowledge discovery and search results, employing advanced analytical methods over a massive corpus, employing a combination of entities, co-occurring entities, topic IDs, and other derived features. The disclosed method may use validation to provide input to the system for disambiguating features.
Publication number US9223833(B2) Publication date 2015.12.29
Application number US201414558237 Filing date 2014.12.02
Applicant QBASE, LLC Inventors Lightner Scott; Dave Rakesh; Boddhu Sanjay
Classification G06F17/30; G06F17/22 Main classification G06F17/30
Agency Dentons US LLP Agent Sophir Eric L.; Dentons US LLP
Main claim 1. A computer-implemented method comprising: in response to receiving, by a search manager computer, a search query from a user device: submitting, by the search manager computer, the search query to a search conductor computer module for processing; receiving, by the search manager computer, search query results from the search conductor computer module, wherein the search query results having one or more records matching one or more fields of the search query, wherein the search query results are based at least in part on the search query; sending, by the search manager computer, the search query results to a disambiguation analytic computer for disambiguating the search query results by determining relatedness among individual record features and topic identifications (topic IDs) associated with each record in the search query results, wherein the disambiguation analytic computer comprises a main memory storing an in-memory database, wherein the disambiguation analytic computer comprises a linking module which links disambiguation data, in real-time, as the disambiguation data is requested by the search manager computer from the disambiguation analytic computer, wherein the in-memory database is coupled to the linking module, wherein a scoring algorithm is used to determine a probability of at least two individual record features being the same; receiving, by the search manager computer, disambiguated search query results from the disambiguation analytic computer; forwarding, by the search manager computer, the disambiguated search query results to the user device for providing an input on the disambiguated search query results; and in response to receiving, by the search manager computer, the input on the disambiguated search query results from the user device: creating, by the search manager computer, a new feature occurrence record in a knowledge base database, wherein the new feature occurrence record including the input, wherein the in-memory database comprises the knowledge base database, storing, by the search manager computer, the new feature occurrence record in the knowledge base database, and adjusting, by the disambiguation analytic computer, one or more parameters of a disambiguation algorithm based on the input from the user device, wherein the disambiguation algorithm involves at least the linking module.
Address Reston VA US
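The feedback loop recited in claim 1 (score the records, show the disambiguated results to the user, store the user's input as a new feature occurrence record, then adjust a disambiguation parameter) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the class and function names are hypothetical, Jaccard overlap of co-occurring features stands in for the unspecified scoring algorithm, and a single decision threshold stands in for the "one or more parameters" that are adjusted.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeBase:
    """Hypothetical in-memory store of feature occurrence records."""
    records: list = field(default_factory=list)

    def store(self, record: dict) -> None:
        self.records.append(record)


@dataclass
class Disambiguator:
    """Hypothetical stand-in for the disambiguation analytic computer."""
    threshold: float = 0.5  # assumed tunable parameter
    step: float = 0.05      # assumed size of each adjustment

    def score(self, a: set, b: set) -> float:
        # Jaccard overlap of co-occurring features, used here as an
        # assumed proxy for the claim's unspecified scoring algorithm.
        union = a | b
        return len(a & b) / len(union) if union else 0.0

    def disambiguate(self, query_features: set, results: list) -> list:
        # Attach a score and a same-entity decision to every record.
        scored = []
        for rec in results:
            s = self.score(query_features, rec["features"])
            scored.append({**rec, "score": s,
                           "same_entity": s >= self.threshold})
        return scored

    def adjust(self, score: float, user_says_same: bool) -> None:
        # Nudge the threshold only when the user contradicts the decision.
        predicted = score >= self.threshold
        if predicted and not user_says_same:
            self.threshold = min(1.0, self.threshold + self.step)
        elif not predicted and user_says_same:
            self.threshold = max(0.0, self.threshold - self.step)


def handle_feedback(kb: KnowledgeBase, dis: Disambiguator,
                    result: dict, user_says_same: bool) -> dict:
    """Store the user's input as a new record, then adjust the parameter."""
    record = {"record_id": result["id"],
              "score": result["score"],
              "input": user_says_same}
    kb.store(record)
    dis.adjust(result["score"], user_says_same)
    return record
```

In this sketch a rejected match raises the threshold and a confirmed non-match lowers it, so repeated user validation gradually shifts which record pairs are treated as the same entity. A real system would presumably tune richer parameters than one threshold.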