Title of invention: Method and apparatus for web crawling
Abstract: A method and system for retrieving data from a webpage is described herein. A scheduler orders a group of webpage identifiers according to predetermined criteria. Based on this ordering, a fetcher may be configured to fetch data from the webpages identified by the identifiers. To promote efficiency and reduce the latency between when a webpage is updated and when the fetcher retrieves data from it, the scheduler may be configured to reorder the identifiers so that an identifier that was less relevant, and would not have been sent to the fetcher, becomes more relevant. In this way, the method and system may be particularly useful for retrieving data from webpages that are updated frequently, such as social media webpages.
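The reordering described in the abstract can be illustrated with a small priority-queue sketch. This is not the patented implementation: the class name `Scheduler`, the methods `set_score` and `next_batch`, the numeric relevance scores, and the example identifiers are all assumptions made for illustration, and the heap with lazy invalidation is simply one common way to support reordering.

```python
import heapq
import itertools


class Scheduler:
    """Hypothetical scheduler: orders webpage identifiers by a relevance score."""

    def __init__(self):
        self._heap = []                    # entries: (-score, tie_breaker, identifier)
        self._current = {}                 # identifier -> its latest (live) entry
        self._counter = itertools.count()  # unique tie-breaker, preserves insertion order

    def set_score(self, identifier, score):
        """Add an identifier, or reorder it by assigning a new score."""
        entry = (-score, next(self._counter), identifier)
        self._current[identifier] = entry  # any older entry for it becomes stale
        heapq.heappush(self._heap, entry)

    def next_batch(self, size):
        """Pop up to `size` identifiers, most relevant first, for the fetcher."""
        batch = []
        while self._heap and len(batch) < size:
            entry = heapq.heappop(self._heap)
            identifier = entry[2]
            if self._current.get(identifier) == entry:  # skip stale entries
                del self._current[identifier]
                batch.append(identifier)
        return batch


# Assumed example identifiers and scores (not from the patent).
scheduler = Scheduler()
scheduler.set_score("news.example/home", 0.9)
scheduler.set_score("blog.example/feed", 0.2)
scheduler.set_score("docs.example/faq", 0.5)

# The blog was just updated, so the scheduler raises its relevance.
scheduler.set_score("blog.example/feed", 0.95)

# Only two identifiers are sent to the fetcher in this round.
batch = scheduler.next_batch(2)
print(batch)
```

Without the reordering step, the low-scored blog identifier would not have made it into the two-item batch; after its score is raised, it is fetched first.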
Publication number: US8712992(B2) Publication date: 2014.04.29
Application number: US20090413528 Filing date: 2009.03.28
Applicant: MAYKOV ALEXEY; HURST MATTHEW F.; MICROSOFT CORPORATION Inventor: MAYKOV ALEXEY; HURST MATTHEW F.
Classification: G06F7/00; G06F7/08 Main classification: G06F7/00
Agency: Agent:
Principal claim:
Address: