Title Detecting query-specific duplicate documents
Abstract An improved duplicate detection technique is described that uses query-relevant information to limit the portion(s) of documents to be compared for similarity. Before two documents are compared for similarity, their content may be condensed based on the query. In one embodiment, query-relevant information or text (also referred to as "snippets") is extracted from the documents, and only the extracted snippets, rather than the entire documents, are compared for purposes of determining similarity.
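The snippet-based comparison in the abstract can be illustrated with a minimal sketch. The abstract does not say how snippets are extracted or how similarity is scored, so everything below is an assumption made for illustration: the fixed word window around each query term, the Jaccard measure, the 0.8 threshold, and the function names `tokenize`, `extract_snippet`, `jaccard`, and `are_query_duplicates` are all hypothetical and are not taken from the patent.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def extract_snippet(document, query, window=3):
    """Return the words lying within `window` positions of any query term.

    Assumption: a fixed-size word window stands in for whatever
    query-relevant extraction the actual system uses.
    """
    words = tokenize(document)
    terms = set(tokenize(query))
    keep = set()
    for i, word in enumerate(words):
        if word in terms:
            start = max(0, i - window)
            stop = min(len(words), i + window + 1)
            keep.update(range(start, stop))
    return [words[i] for i in sorted(keep)]


def jaccard(tokens_a, tokens_b):
    """Jaccard similarity of two token collections (0.0 if both are empty)."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def are_query_duplicates(doc_a, doc_b, query, threshold=0.8, window=3):
    """Compare only the query-relevant snippets, not the whole documents.

    Assumption: documents are treated as duplicates for this query when the
    snippet similarity reaches `threshold`; an empty snippet never matches.
    """
    snippet_a = extract_snippet(doc_a, query, window)
    snippet_b = extract_snippet(doc_b, query, window)
    if not snippet_a or not snippet_b:
        return False
    return jaccard(snippet_a, snippet_b) >= threshold
```

Under these assumptions, two pages that share the passage matching the query but differ in surrounding headers and footers would be flagged as duplicates for that query, even though a whole-document comparison would score them as dissimilar.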
Publication number US8214359(B1) Publication date 2012.07.03
Application number US20100839164 Filing date 2010.07.19
Applicant GOMES BENEDICT A.;SMITH BENJAMIN T.;GOOGLE INC. Inventor GOMES BENEDICT A.;SMITH BENJAMIN T.
Classification G06F7/00;G06F17/30 Main classification G06F7/00
Agency Agent
Main claim
Address