Abstract |
A method and apparatus are disclosed for comparing an input (query) file against a set of files to detect similarities, and for digitally shredding files that match the query file to some degree, directly from within the comparison feature. Using a comparison program, the query file is compared with each non-query file in a data processing system, which may range from a stand-alone computer to an enterprise computing network. A list of the non-query files having some degree of similarity to the query file is compiled and presented to the user through a user interface within the comparison program. Some or all of the listed non-query files can then be deleted by marking their names in the list. The comparison program may use clustering, coalescing, or both; known hashing techniques; or other comparison algorithms. |
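
The flow described above (compare, list matches, delete the marked files) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the disclosed implementation: the function names `shingles`, `similarity`, `find_similar`, and `shred`, the 8-byte window size, the CRC32 hash, the Jaccard score, and the 0.5 threshold are all hypothetical choices made for the example. The single random overwrite stands in for "digital shredding" and may not securely erase data on SSDs or on journaling and copy-on-write file systems.

```python
import os
import zlib
from pathlib import Path


def shingles(data: bytes, k: int = 8) -> set[int]:
    """Hash every k-byte window of the data (assumed hashing scheme: CRC32)."""
    if not data:
        return set()
    if len(data) < k:
        return {zlib.crc32(data)}
    return {zlib.crc32(data[i:i + k]) for i in range(len(data) - k + 1)}


def similarity(a: bytes, b: bytes, k: int = 8) -> float:
    """Jaccard similarity of the two shingle sets, in the range 0.0 to 1.0."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 1.0  # two empty files are treated as identical
    return len(sa & sb) / len(sa | sb)


def find_similar(query: Path, root: Path,
                 threshold: float = 0.5) -> list[tuple[Path, float]]:
    """Compare the query file with every other file under root.

    Returns (path, score) pairs at or above the threshold, best match first.
    This list is what a user interface would present for marking.
    """
    query_data = query.read_bytes()
    query_real = query.resolve()
    matches = []
    for path in root.rglob("*"):
        if not path.is_file() or path.resolve() == query_real:
            continue  # skip directories and the query file itself
        score = similarity(query_data, path.read_bytes())
        if score >= threshold:
            matches.append((path, score))
    return sorted(matches, key=lambda m: m[1], reverse=True)


def shred(path: Path, passes: int = 1) -> None:
    """Overwrite the file with random bytes, then unlink it (simplified)."""
    size = path.stat().st_size
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            f.write(os.urandom(size))
            f.flush()
            os.fsync(f.fileno())
    path.unlink()
```

In use, a caller would pass the result of `find_similar` to whatever interface lets the user mark entries, then call `shred` only on the marked paths. Reading whole files into memory keeps the sketch short; a real system scanning an enterprise network would more likely stream the data or precompute fingerprints.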