Abstract |
PROBLEM TO BE SOLVED: To provide a method of detecting duplicate document content. SOLUTION: A method of detecting duplicate document content in a large document collection comprises: inputting a query document page to a document detection system; comparing the query document page with pages of documents stored in a document collection by using two-dimensional visual fingerprints; and automatically highlighting duplicate or different document content of the query document page and at least one document page of the stored document collection. COPYRIGHT: (C)2012, JPO&INPIT |