Title of invention BUILDING OF A WEB CORPUS WITH THE HELP OF A REFERENCE WEB CRAWL
Abstract Computer-implemented method for building a web corpus (WCD), comprising the steps of: - sending, by a web crawler (WC), a query to a reference web crawl agent (RWCA), the query containing at least one identifier of a resource; - receiving, by the web crawler (WC), a response from the reference web crawl agent (RWCA); - if the response does not contain the resource identified by the identifier, downloading, by the web crawler (WC), the resource from the website (WS) corresponding to the identifier and adding the resource to the web corpus (WCD); and - if the response contains the resource identified by the identifier, adding the resource to the web corpus (WCD).
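The steps of the abstract can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the function name `build_web_corpus`, the two callables standing in for the reference web crawl agent (RWCA) and the website (WS), and the in-memory test data are all hypothetical, and the sketch assumes that a response lacking the resource can be modelled as `None`.

```python
from typing import Callable, Dict, Iterable, List, Optional


def build_web_corpus(
    identifiers: Iterable[str],
    query_reference_agent: Callable[[str], Optional[bytes]],
    download_from_website: Callable[[str], bytes],
) -> Dict[str, bytes]:
    """Build a corpus (WCD) keyed by resource identifier.

    For each identifier, the reference web crawl agent (RWCA) is asked first;
    the website (WS) is contacted only when the response has no resource.
    """
    corpus: Dict[str, bytes] = {}
    for identifier in identifiers:
        # Send the query and receive the response (None = resource absent).
        resource = query_reference_agent(identifier)
        if resource is None:
            # Response does not contain the resource: download it from the site.
            resource = download_from_website(identifier)
        # In both branches the resource is added to the corpus.
        corpus[identifier] = resource
    return corpus


# Hypothetical in-memory stand-ins for the RWCA and the website.
REFERENCE_CRAWL = {"http://example.com/a": b"<html>a (cached)</html>"}
downloaded: List[str] = []


def fake_agent(identifier: str) -> Optional[bytes]:
    # Returns the stored copy, or None when the reference crawl lacks it.
    return REFERENCE_CRAWL.get(identifier)


def fake_download(identifier: str) -> bytes:
    # Records which identifiers actually triggered a website download.
    downloaded.append(identifier)
    return b"<html>fresh copy</html>"


corpus = build_web_corpus(
    ["http://example.com/a", "http://example.com/b"],
    fake_agent,
    fake_download,
)
```

In this sketch only `http://example.com/b` reaches the website, since the stand-in reference crawl already holds a copy of the other resource.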
Publication number CA2812439(A1) Publication date 2013.10.12
Application number CA20132812439 Filing date 2013.04.12
Applicant EXALEAD Inventors RICHARD, SEBASTIEN; GREHANT, XAVIER; FERENCZI, JIM
Classification G06F17/30; H04L12/16 Main classification G06F17/30
Agency Agent
Main claim
Address