Title: Method and apparatus for managing a backlog of pending URL crawls
Abstract: The technology described relates to reducing a backlog of pending URL crawls given a limited URL crawl capacity. It is useful for crawling URLs with low latency. Because crawl capacity is limited, uncrawled URLs from crawl requests are entered into a backlog data structure of pending crawl requests. Various criteria are applied to the URLs requested to be crawled, so that less important URL crawls are rejected early from the backlog data structure. This early rejection tends to limit the backlog data structure to the more important pending URL crawls, and tends to keep average latency low by quickly failing the less important crawl requests.
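The early-rejection idea in the abstract can be illustrated with a minimal sketch. The abstract does not state the actual criteria or data structure, so everything here is an assumption made for illustration: the class name `CrawlBacklog`, a single numeric `importance` score, a fixed `min_importance` threshold, and the policy of evicting the least important pending entry when the backlog is full.

```python
import heapq
import itertools


class CrawlBacklog:
    """Bounded backlog of pending URL crawls with early rejection (illustrative only)."""

    def __init__(self, capacity, min_importance):
        self.capacity = capacity              # assumed cap on pending crawls
        self.min_importance = min_importance  # assumed rejection threshold
        self._heap = []                       # min-heap of (importance, sequence, url)
        self._seq = itertools.count()         # makes every heap entry unique

    def submit(self, url, importance):
        """Return True if the crawl is queued, False if it fails fast."""
        # Criterion 1 (assumed): reject requests below a fixed importance floor.
        if importance < self.min_importance:
            return False
        entry = (importance, next(self._seq), url)
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, entry)
            return True
        # Criterion 2 (assumed): when full, admit only requests that are more
        # important than the least important pending crawl, which is evicted.
        if importance <= self._heap[0][0]:
            return False
        heapq.heapreplace(self._heap, entry)
        return True

    def next_crawl(self):
        """Remove and return the most important pending URL, or None if empty."""
        if not self._heap:
            return None
        best = max(self._heap)     # linear scan; acceptable for a sketch
        self._heap.remove(best)
        heapq.heapify(self._heap)  # restore the heap invariant after removal
        return best[2]
```

With a capacity of two, a request scored below the threshold fails immediately, and a request that arrives at a full backlog is either rejected or displaces the least important pending entry. In both cases the caller learns the outcome at submission time rather than after waiting in the queue, which is the latency benefit the abstract describes.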
Publication number: US8676783(B1) | Publication date: 2014.03.18
Application number: US201113170890 | Filing date: 2011.06.28
Applicant: FEDORYNSKI PAWEL ALEKSANDER; SAMADDAR SUMITRO; GOOGLE INC. | Inventor: FEDORYNSKI PAWEL ALEKSANDER; SAMADDAR SUMITRO
Classification: G06F17/30; G06F7/00 | Main classification: G06F17/30
Agency: | Agent:
Independent claim:
Address: