Title of invention |
INCREMENTAL CRAWLING OF MULTIPLE CONTENT PROVIDERS USING AGGREGATION |
Abstract |
A method is provided for incremental crawling of content stored on a plurality of content providers using aggregation. The method comprises receiving a request to crawl content on one or more associated content providers; retrieving one or more first references to content on a first content provider; retrieving one or more second references to content on one or more second content providers during the same request; aggregating the first and second references; and returning the aggregated first and second references. This is done while taking into account an opaque timestamp object that is managed in a distributed manner. The opaque timestamp is filled in by the content providers but stored on the crawler side between crawling sessions. |
Publication number |
US2014317083(A1) |
Publication date |
2014.10.23 |
Application number |
US201414318804 |
Filing date |
2014.06.30 |
Applicant |
International Business Machines Corporation |
Inventors |
Kenig Batya; Radchenko Constantin; Shapiro Eitan |
Classification number |
G06F17/30 |
Main classification number |
G06F17/30 |
Agency |
|
Agent |
|
Principal claim |
|
Address |
Armonk NY US |
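
The crawl flow summarized in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The class names `ContentProvider` and `Crawler`, the method `changes_since`, and the token encoding (a byte string holding a list offset) are all hypothetical, chosen only for illustration. The sketch assumes each provider issues its own opaque token and that the crawler stores those tokens between sessions without interpreting them.

```python
from typing import Dict, List, Optional, Tuple


class ContentProvider:
    """Hypothetical in-memory provider; a real one would query a backend."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._items: List[str] = []  # append-only log of content references

    def add(self, ref: str) -> None:
        self._items.append(ref)

    def changes_since(self, token: Optional[bytes]) -> Tuple[List[str], bytes]:
        # Only the provider knows the token format (assumed here: an offset).
        start = int(token.decode()) if token else 0
        refs = self._items[start:]
        new_token = str(len(self._items)).encode()
        return refs, new_token


class Crawler:
    """Aggregates references from several providers within one crawl request."""

    def __init__(self, providers: List[ContentProvider]) -> None:
        self.providers = list(providers)
        # Opaque tokens kept on the crawler side between crawling sessions.
        self.tokens: Dict[str, bytes] = {}

    def crawl(self) -> List[str]:
        aggregated: List[str] = []
        for provider in self.providers:
            refs, new_token = provider.changes_since(self.tokens.get(provider.name))
            aggregated.extend(f"{provider.name}:{ref}" for ref in refs)
            self.tokens[provider.name] = new_token  # stored, never parsed
        return aggregated


# Usage: the second crawl returns only what changed since the first.
files = ContentProvider("files")
wiki = ContentProvider("wiki")
files.add("a.txt")
wiki.add("Home")

crawler = Crawler([files, wiki])
first = crawler.crawl()
wiki.add("FAQ")
second = crawler.crawl()
print(first)
print(second)
```

Keeping the token opaque to the crawler lets each provider choose its own change-tracking scheme (a clock value, a sequence number, a cursor) without the crawler needing to understand it.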