Method and system for incremental web crawling, Application No. US19990345040 - 传众专利搜索 (Chuanzhong Patent Search)


Title of invention	Method and system for incremental web crawling
Abstract	A Web crawler creates an index of documents in a document store on a computer network. In an initial crawl, the crawler creates a first full index for the document store. The first full crawl is based on a set of predefined "seed" URLs and crawl restrictions, and involves recursively retrieving each folder/document directly or indirectly linked to the seed URLs. In the process of creating the first full index, the crawler creates a History Table containing a list of URLs for each folder and document found in the first full crawl. The History Table also includes a local commit time (LCT) for each document, and a deleted documents count (DDC) and an LCT or maximum LCT (MLCT) for each folder (this assumes that the store supports a folder hierarchy and the MLCT, LCT and DDC properties). Thereafter, in an incremental crawl, the crawler determines, for each folder, (1) whether the DDC for that folder has changed and (2) whether the MLCT is more recent than the corresponding value in the History Table. If the DDC has changed, the crawler obtains a full list of items (URLs) in that folder and compares the list with the URLs in the History Table to identify the deleted documents. The deleted documents are then removed from the History Table and the index. If the MLCT is more recent, the crawler queries the document store for the URLs of linked documents having an LCT more recent than the MLCT in the History Table for the folder. The History Table and index are then updated to reflect the changes to the document store.
Publication No.	US6631369(B1)	Publication date	2003.10.07
Application No.	US19990345040	Filing date	1999.06.30
Applicant	MICROSOFT CORPORATION	Inventors	MEYERZON DMITRIY;SHOROFF SRIKANTH;TEREK F. SONER;SANU SANKRANT
Classification	G06F17/30;(IPC1-7):G06F17/30	Main classification	G06F17/30
Agency		Agent
Principal claim
Address
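
The incremental crawl described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Folder` and `History` classes, the function names, the use of integers as commit times, and the plain `set` standing in for the index are all assumptions made for illustration. The store is modelled as a flat dictionary of folders, so the recursive traversal from seed URLs and the crawl restrictions are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Folder:
    """Hypothetical store-side folder exposing the DDC and MLCT properties."""
    docs: dict = field(default_factory=dict)  # document URL -> LCT
    ddc: int = 0   # deleted documents count
    mlct: int = 0  # maximum local commit time seen in this folder

    def put(self, url, lct):
        """Add or update a document and advance the folder's MLCT."""
        self.docs[url] = lct
        self.mlct = max(self.mlct, lct)

    def delete(self, url):
        """Remove a document and bump the deleted documents count."""
        del self.docs[url]
        self.ddc += 1


@dataclass
class History:
    """History Table: per-folder (DDC, MLCT) and per-document LCT."""
    folders: dict = field(default_factory=dict)  # folder URL -> (ddc, mlct)
    docs: dict = field(default_factory=dict)     # folder URL -> {doc URL: LCT}


def full_crawl(store, history, index):
    """Initial crawl: index every document and record it in the History Table."""
    for folder_url, folder in store.items():
        history.folders[folder_url] = (folder.ddc, folder.mlct)
        history.docs[folder_url] = dict(folder.docs)
        index.update(folder.docs)  # adds the document URLs to the set


def incremental_crawl(store, history, index):
    """Apply the two per-folder checks from the abstract."""
    for folder_url, folder in store.items():
        # A folder unseen so far is treated as having no recorded documents.
        old_ddc, old_mlct = history.folders.get(folder_url, (folder.ddc, -1))
        known = history.docs.setdefault(folder_url, {})

        # (1) DDC changed: compare the full listing with the History Table
        #     to find deleted documents, then drop them from table and index.
        if folder.ddc != old_ddc:
            for url in set(known) - set(folder.docs):
                del known[url]
                index.discard(url)

        # (2) MLCT is newer: fetch only documents committed after the
        #     MLCT recorded for this folder at the previous crawl.
        if folder.mlct > old_mlct:
            for url, lct in folder.docs.items():
                if lct > old_mlct:
                    known[url] = lct
                    index.add(url)

        history.folders[folder_url] = (folder.ddc, folder.mlct)
```

To try the sketch, build a `store` dictionary that maps a folder URL to a `Folder`, run `full_crawl` once with an empty `History` and an empty `set`, then modify the folder through `put` and `delete` and call `incremental_crawl`. Only the affected URLs are touched in the index, which is the saving the abstract attributes to tracking DDC and MLCT per folder.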
