Abstract |
The technology described relates to reducing a backlog of pending URL crawls given a limited URL crawl capacity, and is useful for crawling URLs with low latency. Because crawl capacity is limited, URLs from crawl requests that have not yet been crawled are entered into a backlog data structure of pending crawl requests. Various criteria are applied to the URLs that are requested to be crawled, so that less important URL crawls are rejected early from the backlog data structure. This early rejection tends to limit the backlog data structure to the more important pending URL crawls and, by quickly failing the less important requests, tends to keep the average latency low. |
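
The early-rejection idea can be illustrated with a minimal sketch. It is not the claimed implementation: the abstract does not name the criteria, so the single numeric `importance` score, the `min_importance` threshold, the fixed `capacity`, and the names `CrawlRequest` and `CrawlBacklog` are all hypothetical choices made for illustration. The sketch also assumes that a full backlog may evict its least important entry in favor of a more important newcomer.

```python
import heapq
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(order=True)
class CrawlRequest:
    # Hypothetical score standing in for the abstract's "various criteria".
    importance: float
    url: str = field(compare=False)


class CrawlBacklog:
    """Bounded backlog of pending crawls that rejects low-value requests early."""

    def __init__(self, capacity: int, min_importance: float) -> None:
        self.capacity = capacity              # assumed limit on pending crawls
        self.min_importance = min_importance  # assumed admission threshold
        self._heap: List[CrawlRequest] = []   # min-heap keyed on importance

    def submit(self, request: CrawlRequest) -> Optional[CrawlRequest]:
        """Try to enqueue a request.

        Returns the request that was rejected (so the caller can fail it
        immediately), or None if nothing was rejected.
        """
        # Criterion 1: fail fast on requests below the threshold.
        if request.importance < self.min_importance:
            return request
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, request)
            return None
        # Criterion 2: the backlog is full, so keep whichever is more important.
        if request.importance <= self._heap[0].importance:
            return request
        # Pop the least important pending request and push the new one.
        return heapq.heapreplace(self._heap, request)

    def next_to_crawl(self) -> Optional[CrawlRequest]:
        """Remove and return the most important pending request, if any."""
        if not self._heap:
            return None
        i = max(range(len(self._heap)), key=lambda k: self._heap[k].importance)
        best = self._heap.pop(i)
        heapq.heapify(self._heap)  # restore the heap invariant after removal
        return best


backlog = CrawlBacklog(capacity=2, min_importance=0.2)
for score, url in [(0.1, "a"), (0.5, "b"), (0.9, "c"), (0.7, "d")]:
    rejected = backlog.submit(CrawlRequest(score, url))
    if rejected is not None:
        print("rejected early:", rejected.url)
```

Because a rejected request is returned to the caller at submission time, it fails immediately instead of waiting in the backlog, which is the mechanism the abstract credits for keeping average latency low.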