Title of invention SCHEDULING RESOURCE CRAWLS
Abstract Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for scheduling resource crawls. In one aspect, a framework is provided for scheduling resource crawls in which a crawl scheduler determines the health of a document (i.e., whether it can be crawled), the popularity of the document, and the frequency of "interesting" (i.e., substantive) content changes. Based on this information, the scheduler estimates an appropriate crawl interval for each web resource to improve crawl resource utilization.
Publication number US2013144858(A1) Publication date 2013.06.06
Application number US201113011426 Filing date 2011.01.21
Applicant LIN ZHEN;STEVENS KEITH;GOOGLE INC. Inventor LIN ZHEN;STEVENS KEITH
Classification G06F17/30 Main classification G06F17/30
Agency Agent
Principal claim
Address
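
The abstract names three inputs to the crawl scheduler (document health, popularity, and the frequency of substantive content changes) but this record does not give the formula that combines them. The following is only a minimal illustrative sketch of how such inputs might be turned into a crawl interval. The names `ResourceStats` and `estimate_crawl_interval_hours`, the score ranges, the interval bounds, and the combining rule are all assumptions made for illustration and are not taken from the patent.

```python
from dataclasses import dataclass

# Hypothetical bounds on the crawl interval (assumptions, not from the patent).
MIN_INTERVAL_HOURS = 1.0
MAX_INTERVAL_HOURS = 24.0 * 30  # 30 days


@dataclass
class ResourceStats:
    """Per-resource signals named in the abstract; the units are assumed."""
    is_healthy: bool     # whether the document can currently be crawled
    popularity: float    # assumed score in [0, 1]
    change_rate: float   # assumed: substantive content changes per day


def estimate_crawl_interval_hours(stats: ResourceStats) -> float:
    """Return an illustrative crawl interval in hours.

    Assumed rule: aim for one crawl per expected substantive change,
    shorten the interval for popular documents, and back off to the
    maximum interval for documents that cannot be crawled.
    """
    if not stats.is_healthy:
        return MAX_INTERVAL_HOURS

    if stats.change_rate <= 0:
        base = MAX_INTERVAL_HOURS
    else:
        base = 24.0 / stats.change_rate  # hours between expected changes

    # Clamp popularity to [0, 1]; a fully popular document halves the interval.
    popularity = min(max(stats.popularity, 0.0), 1.0)
    interval = base / (1.0 + popularity)

    return min(max(interval, MIN_INTERVAL_HOURS), MAX_INTERVAL_HOURS)
```

Under these assumptions, a healthy document with popularity 1.0 that changes substantively twice a day would be scheduled about every 6 hours, while an unhealthy document would be deferred to the 30-day maximum so that crawl capacity is spent elsewhere.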