Title of invention: Spam filtering based on statistics and token frequency modeling
Abstract: Embodiments are directed towards classifying messages as spam using a two-phase approach. The first phase employs a statistical classifier to classify messages based on message content. The second phase targets specific message types to capture dynamic characteristics of the messages and identifies spam messages using a token-frequency-based approach. A client component receives messages and sends them to the statistical classifier, which determines the probability that a message belongs to a particular message class. The statistical classifier also provides other information about the message, including a token list and token thresholds. The message class, token list, and thresholds are provided to the second phase, which determines the number of spam tokens in a given message for a given message class. Based on the threshold, the client component then determines whether the message is spam or non-spam.
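The two-phase flow described in the abstract can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the patent: the `PROFILES` table, the cue words, the token lists, the thresholds, and the functions `tokenize`, `classify`, and `is_spam`. The phase-one scorer is a toy smoothed keyword count standing in for a real statistical classifier, and the comparison direction (spam when the count meets or exceeds the threshold) is an assumption, since the abstract only says the decision is based on the threshold.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Phase1Result:
    """What phase one hands to phase two (field names are illustrative)."""
    message_class: str
    probability: float
    token_list: FrozenSet[str]
    threshold: int


# Hypothetical per-class data: (phase-one cue words, phase-two spam tokens, threshold).
PROFILES = {
    "pharmacy": ({"pills", "prescription"}, frozenset({"cheap", "discount", "pills"}), 2),
    "finance": ({"loan", "bank"}, frozenset({"urgent", "wire", "guaranteed"}), 2),
}


def tokenize(text: str) -> List[str]:
    """Lowercase whitespace tokenizer that trims common punctuation."""
    return [w.strip(".,!?:;").lower() for w in text.split()]


def classify(text: str) -> Phase1Result:
    """Phase one (toy stand-in): pick the most probable message class.

    Scores are add-one-smoothed cue-word counts, normalized to sum to 1.
    """
    words = tokenize(text)
    scores = {
        name: 1 + sum(w in cues for w in words)
        for name, (cues, _, _) in PROFILES.items()
    }
    total = sum(scores.values())
    best = max(scores, key=scores.get)
    _, tokens, threshold = PROFILES[best]
    return Phase1Result(best, scores[best] / total, tokens, threshold)


def is_spam(text: str) -> bool:
    """Phase two: count spam tokens for the predicted class, then apply the threshold.

    Assumption: a message is spam when the count is at least the threshold.
    """
    result = classify(text)
    spam_count = sum(1 for w in tokenize(text) if w in result.token_list)
    return spam_count >= result.threshold
```

In this sketch, a message such as "Cheap pills, discount prescription today!" falls in the hypothetical "pharmacy" class and contains three of that class's spam tokens, so it is flagged; "Your prescription is ready for pickup" falls in the same class but contains none, so it is not.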
Publication number: US8364766(B2) Publication date: 2013.01.29
Application number: US20080328723 Filing date: 2008.12.04
Applicants: YAHOO! INC.;ZHENG LEI;NARAYAN SHARAT;RISHER MARK E.;WEI STANLEY KE;RAMARAO VISHWANATH TUMKUR;KUNDU ANIRBAN Inventors: ZHENG LEI;NARAYAN SHARAT;RISHER MARK E.;WEI STANLEY KE;RAMARAO VISHWANATH TUMKUR;KUNDU ANIRBAN
Classification: G06F15/16 Main classification: G06F15/16
Agency: Agent:
Principal claim:
Address: