Title of invention: DETECTION/SUPPRESSION METHOD FOR LOOK-ALIKE (DUPLICATION)
Abstract: PROBLEM TO BE SOLVED: To provide a method for detecting similar objects in a collection of such objects. SOLUTION: The method modifies a previous method so that per-object memory requirements are reduced while false detections are avoided approximately as well as in the previous method. The modification includes (i) combining k samples of features into s supersamples, with k smaller than the corresponding value used in the previous method; (ii) recording each supersample to b bits of precision, with b smaller than the corresponding value used in the previous method; and (iii) requiring l matching supersamples in order to conclude that two objects are sufficiently similar, with l greater than the corresponding value required in the previous method. One application of the method is in association with a web search engine query service, to determine clusters of query results that are look-alike (similar) documents. COPYRIGHT: (C)2006, JPO&NCIPI
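The three parameters named in the abstract (k samples per supersample, b bits per supersample, l required matches) can be illustrated with a minimal Python sketch. This is not the patented implementation. It assumes word shingles as the features, min-hash sampling with seeded SHA-256 as the sampling scheme, that each of the s supersamples combines its own k samples, and position-by-position comparison of supersamples. The function names and the default values of k, s, b, l and w are hypothetical and chosen only for illustration.

```python
import hashlib


def _hash64(data: str, seed: int) -> int:
    """Seeded 64-bit hash (assumption: SHA-256 stands in for the hash family)."""
    digest = hashlib.sha256(f"{seed}:{data}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def shingles(text: str, w: int = 4) -> set:
    """Features of a document: the set of its w-word shingles (an assumption)."""
    words = text.lower().split()
    return {" ".join(words[i:i + w]) for i in range(max(1, len(words) - w + 1))}


def sketch(text: str, k: int = 2, s: int = 6, b: int = 16, w: int = 4) -> list:
    """Return s supersamples, each built from k min-hash samples, kept to b bits."""
    feats = shingles(text, w)
    mask = (1 << b) - 1
    supersamples = []
    for i in range(s):
        # (i) take k samples, each the minimum of the features under its own seed
        samples = [min(_hash64(f, i * k + j) for f in feats) for j in range(k)]
        # combine the k samples into one supersample
        combined = _hash64(",".join(map(str, samples)), 1_000_000 + i)
        # (ii) record only b bits of the supersample
        supersamples.append(combined & mask)
    return supersamples


def looks_alike(sketch_a: list, sketch_b: list, l: int = 2) -> bool:
    """(iii) declare two objects similar when at least l supersamples match."""
    matches = sum(x == y for x, y in zip(sketch_a, sketch_b))
    return matches >= l


if __name__ == "__main__":
    doc = "the quick brown fox jumps over the lazy dog near the quiet river bank"
    other = "completely unrelated words appear here describing patents and search engines"
    print(looks_alike(sketch(doc), sketch(doc)))
    print(looks_alike(sketch(doc), sketch(other)))
```

In this sketch the per-object storage is s × b bits. Lowering k makes each supersample more likely to match for moderately similar objects, and lowering b makes accidental matches more likely; raising l compensates for both, which is the trade-off the abstract describes.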
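Publication number: JP2005276205(A)  Publication date: 2005.10.06
Application number: JP20050080092  Filing date: 2005.03.18
Applicant: MICROSOFT CORP  Inventor: MANASSE MARK S
Classification: G06F17/30;(IPC1-7):G06F17/30  Main classification: G06F17/30
Agency:  Agent:
Main claim:
Address: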