Title of invention: METHOD FOR TRACKING SYNTACTIC PROPERTIES OF A URL
Abstract: A method of classifying URLs by analyzing each URL discovered by a crawler and matching it against a set of words corresponding to each class, such as pornography, archive, obituary, business news, politics, or terrorism. A count associating the prefix of the URL with the matched class is updated, and an action is performed with respect to electronic documents on the computer system based on the count. The action could be blocking the computer system from being crawled or adjusting the frequency with which the computer system is crawled.
Publication number: US2008162448(A1)  Publication date: 2008.07.03
Application number: US20060617297  Filing date: 2006.12.28
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION  Inventor: JALAN PIYOOSH
Classification: G06F17/30;G06F21/20  Main classification: G06F17/30
Agency:  Agent:
Main claim:
Address:
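
The flow described in the abstract (match each crawled URL against per-class word lists, update a count for the URL's prefix, then act on the count) can be illustrated with a minimal sketch. It is not the patented implementation. The word lists, the threshold value, the choice of scheme plus host as the "prefix", and all function names are assumptions made for illustration only; the record does not specify them.

```python
import re
from urllib.parse import urlsplit

# Hypothetical word lists: the abstract names the classes, not the words.
CLASS_WORDS = {
    "archive": {"archive", "archives"},
    "obituary": {"obituary", "obituaries"},
    "politics": {"politics", "election"},
}

# Assumed threshold: block a prefix once this many URLs match one class.
BLOCK_THRESHOLD = 3


def url_prefix(url):
    """Return scheme plus host; treating this as the 'prefix' is an assumption."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


def classify(url):
    """Return every class whose word list shares a token with the URL path or query."""
    parts = urlsplit(url)
    text = (parts.path + "?" + parts.query).lower()
    tokens = set(re.split(r"[^a-z0-9]+", text))
    return {cls for cls, words in CLASS_WORDS.items() if tokens & words}


def track(urls):
    """Build a {prefix: {class: count}} table from the crawled URLs."""
    counts = {}
    for url in urls:
        per_class = counts.setdefault(url_prefix(url), {})
        for cls in classify(url):
            per_class[cls] = per_class.get(cls, 0) + 1
    return counts


def action_for(counts, prefix, cls):
    """Choose an action from the count: 'block' at or above the threshold, else 'crawl'."""
    count = counts.get(prefix, {}).get(cls, 0)
    return "block" if count >= BLOCK_THRESHOLD else "crawl"


if __name__ == "__main__":
    crawled = [
        "http://example.com/archive/2006/a.html",
        "http://example.com/archive/2006/b.html",
        "http://example.com/archives/c.html",
        "http://example.org/news/election.html",
    ]
    table = track(crawled)
    print(action_for(table, "http://example.com", "archive"))
    print(action_for(table, "http://example.org", "politics"))
```

A real system might instead lower the crawl frequency gradually as the count grows, which the abstract names as an alternative to blocking; the binary decision here is only the simplest case.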