Title of invention |
Automated access to web content based on log analysis |
Abstract |
The present invention provides Web crawlers that can efficiently access Web content not reachable via static hyperlinks. Log files record the communications between a Web browser and a Web server that result from real user accesses to content associated with dynamic hyperlinks. These log files represent past users' accesses to the content and are used to generate Web crawler accesses. This approach lets a crawler accurately mimic real users, so that it can automatically access all the content that real users would have access to.
|
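The log-driven idea in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the tab-separated log format (`METHOD`, path, optional body), the `CrawlRequest` type, the `requests_from_log` function, and the `example.com` host are all hypothetical names chosen for illustration. The sketch only builds a de-duplicated list of requests that a crawler could replay; it sends nothing over the network.

```python
from dataclasses import dataclass
from urllib.parse import urljoin


@dataclass(frozen=True)
class CrawlRequest:
    """One request a crawler could replay (hypothetical structure)."""
    method: str
    url: str
    body: str = ""


def requests_from_log(log_lines, base_url):
    """Turn logged browser-to-server requests into a de-duplicated replay list.

    Assumed (hypothetical) log format, one request per line:
        METHOD<TAB>PATH[<TAB>BODY]
    """
    seen = set()
    replay = []
    for line in log_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # skip malformed entries
        method, path = fields[0].upper(), fields[1]
        body = fields[2] if len(fields) > 2 else ""
        request = CrawlRequest(method, urljoin(base_url, path), body)
        if request not in seen:  # replay each distinct user access once
            seen.add(request)
            replay.append(request)
    return replay


# Sample log of past user accesses (invented data for illustration only).
sample_log = [
    "GET\t/search?q=laptops",
    "POST\t/catalog/filter\tcategory=books&page=2",
    "GET\t/search?q=laptops",
]
plan = requests_from_log(sample_log, "http://example.com/")
```

Keeping the request body alongside the URL matters here because content behind dynamic hyperlinks is often reached through form submissions, which a URL alone would not reproduce.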
Publication number |
US7483910(B2) |
Publication date |
2009.01.27 |
Application number |
US20020042367 |
Filing date |
2002.01.11 |
Applicant |
INTERNATIONAL BUSINESS MACHINES CORPORATION |
Inventors |
BEYER KEVIN SCOTT;MYLLYMAKI JUSSI PETRI |
Classification codes |
G06F17/00;G06F17/30 |
Main classification code |
G06F17/00 |
Patent agency |
|
Patent agent |
|
Principal claim |
|
Address |
|