Document clustering that applies a locality sensitive hashing function to a feature vector to obtain a limited set of candidate clusters, Application No. US20080072179

Title of invention	Document clustering that applies a locality sensitive hashing function to a feature vector to obtain a limited set of candidate clusters
Abstract	Documents from a data stream are clustered by first generating a feature vector for each document. A set of cluster centroids (e.g., feature vectors of their corresponding clusters) are retrieved from a memory based on the feature vector of the document using a locality sensitive hashing function. The centroids may be retrieved by retrieving a set of cluster identifiers from a cluster table, the cluster identifiers each indicative of a respective cluster centroid, and retrieving the cluster centroids corresponding to the retrieved cluster identifiers from a memory. Documents may then be clustered into one or more of the candidate clusters using distance measures from the feature vector of the document to the cluster centroids.
Publication No.	US7797265(B2)	Publication date	2010.09.14
Application No.	US20080072179	Filing date	2008.02.25
Applicant	SIEMENS CORPORATION	Inventors	BRINKER KLAUS;MOERCHEN FABIAN;GLOMANN BERNHARD;NEUBAUER CLAUS
Classification	G06N5/00	Main classification	G06N5/00
Agency		Agent
Main claim
Address
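
The flow described in the abstract (feature vector, locality sensitive hash, cluster table lookup, centroid retrieval, distance comparison) can be illustrated with a minimal Python sketch. It is not taken from the patent text. The class name LshClusterIndex, the choice of sign-random-projection hashing, cosine distance, the threshold value, and the rule of opening a new cluster when no candidate is close enough are all assumptions made for illustration. Centroids are also left unchanged after assignment to keep the sketch short.

```python
import math
import random
from collections import defaultdict


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0


class LshClusterIndex:
    """Hypothetical streaming clusterer that uses LSH to limit candidate clusters."""

    def __init__(self, dim, num_tables=4, bits_per_table=6, threshold=0.5, seed=0):
        rng = random.Random(seed)
        # Assumed hash family: one set of random hyperplanes per hash table.
        self.planes = [
            [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits_per_table)]
            for _ in range(num_tables)
        ]
        # Cluster table: (table index, hash bits) -> set of cluster identifiers.
        self.cluster_table = defaultdict(set)
        # Centroid memory: cluster identifier -> centroid feature vector.
        self.centroids = {}
        self.threshold = threshold

    def _keys(self, vec):
        """Yield one hash key per table: the side of each hyperplane vec falls on."""
        for t, planes in enumerate(self.planes):
            bits = tuple(
                sum(p * v for p, v in zip(plane, vec)) >= 0.0 for plane in planes
            )
            yield (t, bits)

    def candidates(self, vec):
        """Look up cluster identifiers that share a hash bucket with vec."""
        ids = set()
        for key in self._keys(vec):
            ids |= self.cluster_table.get(key, set())
        return ids

    def assign(self, vec):
        """Assign vec to the nearest candidate centroid, or open a new cluster."""
        best_id, best_dist = None, None
        for cid in self.candidates(vec):
            dist = cosine_distance(vec, self.centroids[cid])
            if best_dist is None or dist < best_dist:
                best_id, best_dist = cid, dist
        if best_id is not None and best_dist <= self.threshold:
            return best_id
        # Assumption: no sufficiently close candidate means a new cluster.
        new_id = len(self.centroids)
        self.centroids[new_id] = list(vec)
        for key in self._keys(vec):
            self.cluster_table[key].add(new_id)
        return new_id
```

Only the centroids returned by candidates() are compared against the document, so the number of distance computations depends on bucket occupancy rather than on the total number of clusters.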
