Title of Invention |
METHOD AND APPARATUS FOR FILTERING EMAIL SPAM BASED ON SIMILARITY MEASURES |
Abstract |
A method and system for character-based document comparison are described. In one embodiment, the method includes dividing a first document into tokens. Each token includes a predefined number of sequential characters from the first document. The method further includes calculating hash values for the tokens and creating, for the first document, a signature that includes a subset of the calculated hash values and additional information pertaining to the tokens of the first document. The signature of the first document is subsequently compared with a signature of a second document to determine the resemblance between the first document and the second document. |
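The steps named in the abstract (tokenize, hash, keep a subset of hashes, compare signatures) can be illustrated with a minimal sketch. The abstract does not say which hash function is used, how the subset is chosen, what the additional token information is, or how resemblance is scored, so every such choice below is an assumption made only for illustration: overlapping tokens, CRC32 via `zlib.crc32`, the `keep` smallest distinct hash values, the token count as the additional information, and a simple overlap ratio as the resemblance score. The function and parameter names are hypothetical.

```python
import zlib


def tokenize(text, size=5):
    """Split text into tokens of `size` sequential characters.

    Assumption: tokens overlap (a sliding window); the abstract only
    says each token has a predefined number of sequential characters.
    """
    return [text[i:i + size] for i in range(len(text) - size + 1)]


def make_signature(text, size=5, keep=16):
    """Build a signature: a subset of token hashes plus token metadata.

    Assumptions: CRC32 as the hash, the `keep` smallest distinct hash
    values as the subset, and the token count as the extra information.
    """
    tokens = tokenize(text, size)
    hashes = sorted({zlib.crc32(t.encode("utf-8")) for t in tokens})
    return {"hashes": frozenset(hashes[:keep]), "token_count": len(tokens)}


def resemblance(sig_a, sig_b):
    """Score two signatures as shared hashes over all kept hashes (0.0 to 1.0)."""
    a, b = sig_a["hashes"], sig_b["hashes"]
    union = a | b
    if not union:
        return 0.0  # neither document produced any tokens
    return len(a & b) / len(union)


if __name__ == "__main__":
    first = make_signature("Buy cheap watches now, limited offer!")
    second = make_signature("Buy cheap watches now, limited 0ffer!")
    print(resemblance(first, second))
```

A filter built on this idea would treat a message as a likely match when the score exceeds some threshold; the threshold value is not given in the abstract and would have to be tuned.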
Publication Number |
EP1649645(A2) |
Publication Date |
2006.04.26 |
Application Number |
EP20040752404 |
Filing Date |
2004.05.14 |
Applicant |
BRIGHTMAIL, INC. |
Inventors |
GLEESON, MATT;HOOGSTRATE, DAVID;JENSEN, SANDY;MANTEL, ELI;MEDLAR, ART;SCHNEIDER, KEN |
Classification Numbers |
G06F15/16;H04L12/58 |
Main Classification Number |
G06F15/16 |
Patent Agency |
|
Patent Agent |
|
Principal Claim |
|
Address |
|