Title of invention |
Identifying information related to a particular entity from electronic sources, using dimensional reduction and quantum clustering |
Abstract |
Presented are systems and methods for identifying information about a particular entity. The methods include acquiring electronic documents having unstructured text, the documents being selected based on one or more search terms from a plurality of terms related to the particular entity. The acquired documents are tokenized to form a data matrix, and a plurality of eigenvectors is then calculated using the data matrix and the transpose of the data matrix. A variance is then acquired for determining the amount of intra-clustering between the documents, and the acquired documents are clustered using some of the eigenvectors and the variance. |
Publication number |
US9165061(B2) |
Publication date |
2015.10.20 |
Application number |
US201213587562 |
Filing date |
2012.08.16 |
Applicant |
REPUTATION.COM |
Inventors |
Fertik Michael Benjamin Selkowe;Scott Tony;Dignan Thomas |
Classification |
G06K9/54;G06F17/30;G06K9/62 |
Main classification |
G06K9/54 |
Patent agency |
Finnegan, Henderson, Farabow, Garrett & Dunner LLP |
Agent |
Finnegan, Henderson, Farabow, Garrett & Dunner LLP |
Principal claim |
1. A system for identifying information about a particular entity, comprising:
a collector module configured to provide a request to one or more data sources using one or more search terms to acquire electronic documents having unstructured text, wherein the electronic documents are selected based on the one or more search terms from a plurality of terms related to the particular entity;
a tokenizer module configured to tokenize the acquired documents to form a data matrix;
a data processing module configured to calculate a plurality of eigenvectors, using the data matrix and the transpose of the data matrix;
a quantum clustering module configured to:
acquire a variance for determining the amount of intra-clustering between the documents, and
cluster the acquired documents using some of the eigenvectors and the variance. |
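The clustering step of the claim can be sketched as follows. The record does not say which formulation of quantum clustering is used, so this sketch assumes the published Horn and Gottlieb formulation: a Gaussian wave function is placed on every point, a potential is derived from it, and points that slide downhill to the same potential minimum share a cluster. The function names, step sizes, merge tolerance, and toy coordinates (standing in for documents projected onto two eigenvectors) are all hypothetical.

```python
import math


def potential(x, data, variance):
    """Quantum-clustering potential at x, up to an additive constant.

    Assumed form: sum_i d_i^2 * w_i / (2 * variance * sum_i w_i),
    with d_i the distance to point i and w_i = exp(-d_i^2 / (2 * variance)).
    """
    weighted = 0.0
    psi = 0.0
    for p in data:
        d2 = sum((a - b) ** 2 for a, b in zip(x, p))
        w = math.exp(-d2 / (2.0 * variance))
        weighted += d2 * w
        psi += w
    return weighted / (2.0 * variance * psi)


def descend(x, data, variance, step=0.05, iters=300, h=1e-4):
    """Move x downhill on the potential using a central-difference gradient."""
    x = list(x)
    for _ in range(iters):
        grad = []
        for k in range(len(x)):
            up, down = x[:], x[:]
            up[k] += h
            down[k] -= h
            grad.append((potential(up, data, variance)
                         - potential(down, data, variance)) / (2.0 * h))
        x = [a - step * g for a, g in zip(x, grad)]
    return x


def quantum_cluster(data, variance, merge_tol=0.1):
    """Label each point by the potential minimum it descends to."""
    minima, labels = [], []
    for p in data:
        end = descend(p, data, variance)
        for idx, m in enumerate(minima):
            if math.dist(end, m) < merge_tol:
                labels.append(idx)
                break
        else:  # no existing minimum is close enough: start a new cluster
            minima.append(end)
            labels.append(len(minima) - 1)
    return labels


# Hypothetical 2-D coordinates for six documents.
points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3),
          (3.0, 3.0), (3.2, 2.9), (2.9, 3.2)]
labels = quantum_cluster(points, variance=0.25)
```

In this formulation the variance sets the width of each Gaussian and therefore how far apart points can be while still sharing a potential minimum, which is one way to read the claim's statement that the variance determines the amount of intra-clustering.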
地址 |
Redwood CA US |