Title of invention Identifying information related to a particular entity from electronic sources, using dimensional reduction and quantum clustering
Abstract Presented are systems and methods for identifying information about a particular entity, including acquiring electronic documents having unstructured text that are selected based on one or more search terms from a plurality of terms related to the particular entity. The acquired documents are tokenized to form a data matrix, and a plurality of eigenvectors is calculated using the data matrix and the transpose of the data matrix. A variance is then acquired for determining the amount of intra-clustering between the documents, and the acquired documents are clustered using some of the eigenvectors and the variance.
Publication number US9165061(B2) Publication date 2015.10.20
Application number US201213587562 Filing date 2012.08.16
Applicant REPUTATION.COM Inventors Fertik Michael Benjamin Selkowe;Scott Tony;Dignan Thomas
Classification G06K9/54;G06F17/30;G06K9/62 Main classification G06K9/54
Agency Finnegan, Henderson, Farabow, Garrett & Dunner LLP Agent Finnegan, Henderson, Farabow, Garrett & Dunner LLP
Principal claim 1. A system for identifying information about a particular entity, comprising: a collector module configured to provide a request to one or more data sources using one or more search terms to acquire electronic documents having unstructured text, wherein the electronic documents are selected based on the one or more search terms from a plurality of terms related to the particular entity; a tokenizer module configured to tokenize the acquired documents to form a data matrix; a data processing module configured to calculate a plurality of eigenvectors, using the data matrix and the transpose of the data matrix; a quantum clustering module configured to: acquire a variance for determining the amount of intra-clustering between the documents, and cluster the acquired documents using some of the eigenvectors and the variance.
Address Redwood CA US
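
The tokenizer, data processing, and quantum clustering modules named in the principal claim can be sketched roughly as follows. This record does not give the formulas, so the sketch rests on several assumptions: the data matrix is a plain term-count matrix, the eigenvectors are taken from the centred data matrix multiplied by its transpose, and the clustering step uses the commonly published quantum clustering formulation (a Gaussian Parzen-window wavefunction whose width `sigma` plays the role of the variance, with points moved downhill on the resulting potential). All function names (`tokenize`, `reduce_dimensions`, `potential`, `quantum_cluster`) and the parameters `steps`, `lr`, and `eps` are hypothetical and are not taken from the patent.

```python
import re
import numpy as np


def tokenize(docs):
    """Build a documents-by-terms count matrix from raw text (assumed layout)."""
    token_lists = [re.findall(r"[a-z0-9]+", d.lower()) for d in docs]
    vocab = sorted({t for toks in token_lists for t in toks})
    index = {t: j for j, t in enumerate(vocab)}
    X = np.zeros((len(docs), len(vocab)))
    for i, toks in enumerate(token_lists):
        for t in toks:
            X[i, index[t]] += 1.0
    return X, vocab


def reduce_dimensions(X, k):
    """Coordinates of each document along the top-k eigenvectors of X @ X.T."""
    Xc = X - X.mean(axis=0)            # centring is an assumption of this sketch
    gram = Xc @ Xc.T                   # data matrix times its transpose
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:k]
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


def potential(points, data, sigma):
    """Quantum clustering potential at each point, up to an additive constant."""
    diff = points[:, None, :] - data[None, :, :]
    sq = (diff ** 2).sum(axis=-1)              # squared distances to the data
    w = np.exp(-sq / (2.0 * sigma ** 2))       # Gaussian Parzen-window terms
    return (sq * w).sum(axis=1) / (2.0 * sigma ** 2 * w.sum(axis=1))


def quantum_cluster(data, sigma, steps=200, lr=0.1, eps=1e-4):
    """Move each point downhill on the potential, then group points that meet."""
    data = np.asarray(data, dtype=float)
    pts = data.copy()
    dim = data.shape[1]
    for _ in range(steps):
        grad = np.zeros_like(pts)
        for j in range(dim):                   # central-difference gradient
            shift = np.zeros(dim)
            shift[j] = eps
            grad[:, j] = (potential(pts + shift, data, sigma)
                          - potential(pts - shift, data, sigma)) / (2.0 * eps)
        pts -= lr * sigma ** 2 * grad
    labels = np.full(len(pts), -1, dtype=int)
    centers = []
    for i, p in enumerate(pts):
        for c, center in enumerate(centers):
            if np.linalg.norm(p - center) < sigma:   # same minimum reached
                labels[i] = c
                break
        else:
            centers.append(p)
            labels[i] = len(centers) - 1
    return labels
```

In this sketch a pipeline run would call `tokenize` on the acquired documents, pass the result to `reduce_dimensions` with a small `k`, and hand the reduced coordinates to `quantum_cluster`. A smaller `sigma` tends to split the documents into more clusters and a larger one merges them, which is one way to read the claim's "amount of intra-clustering".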