Title of invention Automated access to web content based on log analysis
Abstract The present invention provides Web crawlers capable of efficiently accessing Web content that is not reachable via static hyperlinks. Log files are maintained of the communications between a Web browser and a Web server that result from real users accessing the content associated with dynamic hyperlinks. These log files represent past users' accesses to the content and are used to generate Web crawler accesses. This approach allows a crawler to accurately mimic real users, so that the crawler can automatically access all the content that real users would have access to.
Publication number US7483910(B2) Publication date 2009.01.27
Application number US20020042367 Filing date 2002.01.11
Applicant INTERNATIONAL BUSINESS MACHINES CORPORATION Inventors BEYER KEVIN SCOTT;MYLLYMAKI JUSSI PETRI
Classification G06F17/00;G06F17/30 Main classification G06F17/00
Agency Agent
Principal claim
Address
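
The abstract describes turning logged browser-to-server communications into crawler accesses. The following is a minimal sketch of that idea, not the patented implementation. The log format (one JSON object per line with `method`, `url`, and an optional `form` field), the function name `requests_from_log`, and the `example.com` URLs are all assumptions made for illustration; the record above does not specify any of them.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def requests_from_log(log_lines):
    """Build replayable requests from logged user accesses (hypothetical format)."""
    seen = set()
    replay = []
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        method = entry["method"].upper()
        url = entry["url"]
        form = entry.get("form") or {}
        # Skip accesses that repeat an earlier one exactly.
        key = (method, url, tuple(sorted(form.items())))
        if key in seen:
            continue
        seen.add(key)
        # Form fields are re-sent as the body of a POST, as the user's browser did.
        data = urlencode(form).encode("ascii") if method == "POST" else None
        replay.append(Request(url, data=data, method=method))
    return replay


# Assumed sample log: two distinct accesses and one duplicate.
sample_log = [
    '{"method": "GET", "url": "http://example.com/search?q=crawler"}',
    '{"method": "POST", "url": "http://example.com/catalog", '
    '"form": {"category": "books", "page": "2"}}',
    '{"method": "GET", "url": "http://example.com/search?q=crawler"}',
]

crawl_queue = requests_from_log(sample_log)
for req in crawl_queue:
    print(req.get_method(), req.full_url, req.data)
```

The sketch only builds `Request` objects and never sends them, so it runs without network access. A crawler would pass each one to `urllib.request.urlopen` (or an equivalent client) to fetch content that no static hyperlink points to.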