摘要 |
The present disclosure relates to a method, system and software executable by a processor associated with non-transitory computer-readable storage medium for detecting a trap of web-based calendar pages and building a retrieval database. According to an aspect of the disclosure, detecting a trap of web-based calendar pages includes clustering, by a clustering module, URLs corresponding to web pages stored in a database according to a predetermined standard, generating a regular expression by analyzing a date pattern included in a clustering result, and detecting, a cluster suspected of being a trap of web-based perpetual calendar pages using the generated regular expression. |