Abstract |
A computer-implemented method analyzes a website to determine whether the website is a potential source of spam and, in response to the analysis, flags content of the website as spam content. The determination can be made by computing the total number of content items associated with the website, calculating the publication frequency of those content items, and determining from the total number and the publication frequency whether the website in its entirety represents spam content. Alternatively, the determination can be made by generating a characterizing signature of a webpage containing a content item, obtaining an occurrence count for the generated signature, and, when the occurrence count is greater than a threshold count, identifying the content item as spam. |
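
The two determinations described above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the abstract gives no threshold values, no rule for combining the total number with the publication frequency, and no signature algorithm. The constants `MAX_TOTAL_ITEMS`, `MAX_ITEMS_PER_DAY`, and `SIGNATURE_THRESHOLD`, the "both limits exceeded" rule, and the tag-sequence hash used in `page_signature` are all assumptions chosen for illustration.

```python
import hashlib
import re
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical limits; the abstract does not specify concrete values.
MAX_TOTAL_ITEMS = 1000
MAX_ITEMS_PER_DAY = 50.0
SIGNATURE_THRESHOLD = 3


def publication_frequency(timestamps):
    """Average number of content items per day over the observed time span."""
    if len(timestamps) < 2:
        return float(len(timestamps))
    span_days = (max(timestamps) - min(timestamps)).total_seconds() / 86400
    # Treat spans shorter than one day as one day to avoid inflated rates.
    return len(timestamps) / max(span_days, 1.0)


def site_is_spam(timestamps):
    """Site-level check using total item count and publication frequency.

    Assumption: the site is flagged only when both limits are exceeded.
    """
    total = len(timestamps)
    return (total > MAX_TOTAL_ITEMS
            and publication_frequency(timestamps) > MAX_ITEMS_PER_DAY)


def page_signature(html):
    """Assumed signature: hash of the ordered sequence of opening tag names.

    Pages built from the same template but carrying different text
    therefore map to the same signature.
    """
    tags = re.findall(r"<\s*([a-zA-Z][a-zA-Z0-9]*)", html)
    joined = " ".join(tag.lower() for tag in tags)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


class SignatureCounter:
    """Tracks how often each page signature has been observed."""

    def __init__(self, threshold=SIGNATURE_THRESHOLD):
        self.threshold = threshold
        self.counts = Counter()

    def observe(self, html):
        """Record a page; return True once its signature count exceeds the threshold."""
        signature = page_signature(html)
        self.counts[signature] += 1
        return self.counts[signature] > self.threshold
```

In this sketch, `site_is_spam` takes the publication timestamps of all content items found for a site, while `SignatureCounter.observe` is called once per crawled webpage and starts returning `True` for a template only after that template has been seen more than `threshold` times.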