Title: Detecting query-specific duplicate documents
Abstract: An improved duplicate detection technique that uses query-relevant information to limit the portion(s) of documents to be compared for similarity is described. Before comparing two documents for similarity, the content of these documents may be condensed based on the query. In one embodiment, query-relevant information or text (also referred to as “snippets”) is extracted from the documents and only the extracted snippets, rather than the entire documents, are compared for purposes of determining similarity.
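The idea in the abstract can be illustrated with a minimal sketch. It is not the patented method: the snippet extractor (a fixed window of words around each query term), the Jaccard similarity measure, the 0.8 threshold, and all function names below are assumptions chosen for illustration only.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def extract_snippet(document, query, window=3):
    """Hypothetical snippet extractor: keep the words that lie within
    `window` positions of any occurrence of a query term."""
    words = tokenize(document)
    terms = set(tokenize(query))
    keep = set()
    for i, word in enumerate(words):
        if word in terms:
            start = max(0, i - window)
            end = min(len(words), i + window + 1)
            keep.update(range(start, end))
    return [words[i] for i in sorted(keep)]


def jaccard(a, b):
    """Jaccard similarity of two token lists, treated as sets.
    Assumption: two empty snippets are not considered similar."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def is_query_specific_duplicate(doc_a, doc_b, query, threshold=0.8, window=3):
    """Compare only the query-relevant snippets, not the whole documents."""
    snippet_a = extract_snippet(doc_a, query, window)
    snippet_b = extract_snippet(doc_b, query, window)
    return jaccard(snippet_a, snippet_b) >= threshold


doc_a = "Welcome to site A. The quick brown fox jumps over the lazy dog. Copyright site A."
doc_b = "Site B mirror page. The quick brown fox jumps over the lazy dog. All rights reserved B."

# The shared passage around "fox" makes the pages duplicates for that query,
# even though the surrounding page text differs.
print(is_query_specific_duplicate(doc_a, doc_b, "fox"))
# For the query "site", the relevant text differs, so they are not duplicates.
print(is_query_specific_duplicate(doc_a, doc_b, "site"))
```

In this sketch the same pair of pages is a duplicate for one query and distinct for another, which is the query-specific behavior the abstract describes; a whole-document comparison would return a single answer regardless of the query.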
Publication No.: US7779002(B1) Publication date: 2010.08.17
Application No.: US20030602965 Filing date: 2003.06.24
Applicant: GOOGLE INC. Inventors: GOMES BENEDICT ANTHONY; SMITH BENJAMIN THOMAS
Classification: G06F7/00; G06F17/30 Main classification: G06F7/00
Agency: Agent:
Main claim:
Address: