Title of invention: Process for identifying duplicate values in very large data sets
Abstract: The present invention is directed to a method of identifying duplicate data elements in large data sets. The method involves receiving the data sets; dividing each data element in the data set into a series of data segments to define data keys; generating an intermediate value for each element in the data set using summed values of the data keys; sorting the data entries by the intermediate values; sorting the entries with matching intermediate values by the data keys; and identifying the duplicate data elements in the data set.
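The steps named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the fixed 4-byte segment size, the big-endian integer interpretation of each segment, and the names `split_into_keys`, `intermediate_value`, and `find_duplicates` are all assumptions made for illustration, and the sketch sorts in memory rather than addressing very large data sets.

```python
from itertools import groupby


def split_into_keys(element: bytes, segment_size: int = 4) -> tuple:
    """Divide one data element into fixed-size segments (the "data keys").

    The 4-byte segment size is an assumption, not taken from the patent.
    """
    return tuple(
        element[i:i + segment_size]
        for i in range(0, len(element), segment_size)
    )


def intermediate_value(keys: tuple) -> int:
    """Sum the data keys, reading each segment as a big-endian integer (assumed)."""
    return sum(int.from_bytes(key, "big") for key in keys)


def find_duplicates(elements, segment_size: int = 4) -> list:
    """Return each element that occurs more than once in `elements`."""
    # Build (intermediate value, data keys, element) records.
    records = []
    for element in elements:
        keys = split_into_keys(element, segment_size)
        records.append((intermediate_value(keys), keys, element))

    # First sort: by intermediate value only.
    records.sort(key=lambda record: record[0])

    duplicates = []
    for _, bucket in groupby(records, key=lambda record: record[0]):
        # Second sort: entries sharing an intermediate value, by their data keys.
        by_keys = sorted(bucket, key=lambda record: record[1])
        for _, same in groupby(by_keys, key=lambda record: record[1]):
            same = list(same)
            if len(same) > 1:  # identical keys mean identical elements
                duplicates.append(same[0][2])
    return duplicates


if __name__ == "__main__":
    data = [b"abcdwxyz", b"wxyzabcd", b"abcdwxyz", b"unique!!"]
    # b"abcdwxyz" and b"wxyzabcd" share a summed value but differ in their keys.
    print(find_duplicates(data))  # prints [b'abcdwxyz']
```

In this sketch the summed value acts as a cheap first-pass sort key, and the full data keys are compared only among entries whose sums match, which is why the second sort is needed to separate entries that collide on the sum without being duplicates.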
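Publication number: US7590624(B2) Publication date: 2009.09.15
Application number: US20050225309 Filing date: 2005.09.12
Applicant: LSI CORPORATION Inventors: SHIPLEY GERALD L.; CASTANEDA DAVID A.
Classification: G06F17/30 Main classification: G06F17/30
Agency: Agent:
Principal claim:
Address: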