Abstract |
Methods, systems, and computer-readable media for detecting duplicate content in a pair of media files prior to publication on a webpage include generating fingerprints for the contents of each of the pair of media files. The fingerprints of one of the pair of media files are then compared with the fingerprints of the other media file to compute a similarity score. The similarity score is compared against an established threshold. If the similarity score exceeds the established threshold, the two media files are determined to be substantial duplicates of one another.
|
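
The flow described in the abstract (fingerprint each file, compare the fingerprints to get a similarity score, test the score against a threshold) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how fingerprints are generated or how the score is computed, so the fixed-size chunk hashing, the Jaccard similarity measure, the default threshold of 0.8, and all function names here are hypothetical choices rather than the method of the patent.

```python
import hashlib


def fingerprints(content: bytes, chunk_size: int = 4) -> set[str]:
    """Return one SHA-256 digest per fixed-size chunk of the content.

    Assumption: chunk hashing stands in for whatever fingerprinting
    scheme the actual system uses.
    """
    return {
        hashlib.sha256(content[i:i + chunk_size]).hexdigest()
        for i in range(0, len(content), chunk_size)
    }


def similarity_score(a: set[str], b: set[str]) -> float:
    """Jaccard similarity: shared fingerprints over all distinct fingerprints.

    Assumption: the abstract does not name a scoring formula.
    """
    if not a and not b:
        return 1.0  # two empty files are treated as identical
    return len(a & b) / len(a | b)


def is_substantial_duplicate(
    first: bytes, second: bytes, threshold: float = 0.8
) -> bool:
    """Flag the pair as duplicates when the score exceeds the threshold.

    Assumption: 0.8 is an arbitrary illustrative threshold.
    """
    score = similarity_score(fingerprints(first), fingerprints(second))
    return score > threshold
```

In a publication pipeline, a check like `is_substantial_duplicate(new_file_bytes, existing_file_bytes)` would run before the new file is posted to the webpage. Exact chunk hashing only catches byte-identical segments, so a real system handling re-encoded audio or video would likely need perceptual fingerprints instead.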