Abstract |
A method for generating a set of one or more elements of a fingerprint for a document, the document comprising a semantic construct having one or more ordered words, the method comprising the steps of: defining a range of sizes for a fingerprint element; dividing the ordered words of the semantic construct into a set of one or more mutually exclusive fingerprint elements, wherein each of the one or more mutually exclusive fingerprint elements includes a number of adjacent words, the number being within the range of sizes for a fingerprint element; and responsive to a determination that the set of mutually exclusive fingerprint elements excludes a word from the semantic construct, discarding the excluded word.
|
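The steps recited in the abstract can be illustrated with a minimal sketch. It assumes a greedy left-to-right division that takes the largest permitted element first; the abstract does not say how the division is chosen, so this is only one possible reading. The function name `split_into_elements` and the parameters `min_size` and `max_size` are hypothetical and are introduced here purely for illustration.

```python
from typing import List, Sequence, Tuple


def split_into_elements(
    words: Sequence[str], min_size: int, max_size: int
) -> Tuple[List[Tuple[str, ...]], List[str]]:
    """Divide ordered words into non-overlapping runs of adjacent words.

    Assumption: a greedy strategy that always takes up to max_size words.
    Returns (elements, discarded), where discarded holds the trailing words
    that could not form an element within the size range.
    """
    if not 1 <= min_size <= max_size:
        raise ValueError("require 1 <= min_size <= max_size")

    elements: List[Tuple[str, ...]] = []
    discarded: List[str] = []
    i = 0
    while i < len(words):
        # Take as many adjacent words as the range allows.
        size = min(max_size, len(words) - i)
        if size < min_size:
            # The remaining words are too few to form an element,
            # so they are excluded from the set and discarded.
            discarded = list(words[i:])
            break
        elements.append(tuple(words[i:i + size]))
        i += size
    return elements, discarded


sentence = "the quick brown fox jumps over the lazy dog again".split()
elements, discarded = split_into_elements(sentence, min_size=3, max_size=4)
print(elements)
print(discarded)
```

With a range of 3 to 4 words, the ten-word example yields two four-word elements, and the two trailing words are discarded because they cannot form an element of at least three words.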