摘要 |
A method for disambiguating features in unstructured text is provided. The disclosed method may not require pre-existing links to be present. The method for disambiguating features in unstructured text may use co-occurring features derived from both the source document and a large document corpus. The disclosed method may include multiple modules, including a linking module for linking the derived features from the source document to the co-occurring features of an existing knowledge base. The disclosed method for disambiguating features may allow identifying unique entities from a knowledge base that includes entities with a unique set of co-occurring features, which in turn may allow for increased precision in knowledge discovery and search results, employing advanced analytical methods over a massive corpus, employing a combination of entities, co-occurring entities, topic IDs, and other derived features. |