Title of invention SYSTEM FOR WEB CRAWLING AND METHOD THEREOF
Abstract PURPOSE: A web crawling system and method are provided that significantly reduce crawling time by downloading, in a single batch, the outlink pages linked to the webpage with the highest access probability. CONSTITUTION: A seed page priority assigner (11) sets the reference seed pages for web crawling, computes the access probability of each seed page detected during crawling, and assigns a priority to each seed page. A downloader (12) downloads the highest-priority seed page together with the outlink pages linked to it. An outlink page priority assigner (13) computes the access probability of the seed page and assigns a priority to each outlink page.
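The three components in the abstract can be sketched as a priority-driven crawl loop. This is a minimal illustration under stated assumptions, not the patented implementation: the abstract does not say how access probabilities are computed, so they are supplied here as a fixed lookup table, and the web is simulated by an in-memory link graph. The names `WEB`, `ACCESS_PROB`, and `crawl` are hypothetical.

```python
import heapq

# Hypothetical in-memory "web": page -> pages it links to (outlinks).
WEB = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": [],
    "d": ["a"],
    "e": ["c"],
}

# Hypothetical access probabilities. The abstract does not specify
# how these are computed, so fixed values are assumed here.
ACCESS_PROB = {"a": 0.4, "b": 0.1, "c": 0.2, "d": 0.25, "e": 0.05}


def crawl(seeds, web, access_prob):
    """Return the download batches in the order they would be fetched."""
    # Seed page priority assigner (11): heapq is a min-heap, so the
    # probability is negated to pop the most probable page first.
    frontier = [(-access_prob[page], page) for page in seeds]
    heapq.heapify(frontier)

    downloaded = set()  # pages already fetched
    expanded = set()    # pages already used as a seed
    batches = []

    while frontier:
        _, seed = heapq.heappop(frontier)
        if seed in expanded:
            continue
        expanded.add(seed)

        # Downloader (12): fetch the seed and its outlinks as one batch,
        # skipping anything fetched in an earlier batch.
        batch = []
        for page in [seed, *web.get(seed, [])]:
            if page not in downloaded:
                downloaded.add(page)
                batch.append(page)
        if batch:
            batches.append(batch)

        # Outlink page priority assigner (13): queue each outlink as a
        # future seed, ranked by its access probability.
        for link in web.get(seed, []):
            if link not in expanded:
                heapq.heappush(frontier, (-access_prob.get(link, 0.0), link))

    return batches


if __name__ == "__main__":
    for batch in crawl(["a", "e"], WEB, ACCESS_PROB):
        print(batch)
```

In this sketch a real downloader would issue the requests of one batch concurrently, which is where the time saving described in the abstract would come from; the simulation only records which pages would be grouped together.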
Publication number KR20100094263(A) Publication date 2010.08.26
Application number KR20090013597 Filing date 2009.02.18
Applicant KOREA UNIVERSITY RESEARCH AND BUSINESS FOUNDATION Inventor LEE, SANG KEUN;HIJBUL MD. ALAM;HA, JONG WOO
Classification G06F17/30;G06F17/00;G06F17/21 Main classification G06F17/30
Agency Agent
Main claim
Address