Title of invention: Method for learning character patterns to interactively control the scope of a web crawler
Abstract: A method controls a Web search for server computer resources by an end-user Web crawler. Each resource, such as a Web page, is located by a resource address specified as a character string. The end-user defines a scope for an initial Web search by settings. The settings are used to search the Web for resources limited by the scope. The set of resources located during the search is rendered on an output device, and positive and negative examples are selected from the set of resources to infer a rule. The rule is displayed, as well as the subset of resources that match the rule. The selecting, inferring, and rendering steps are repeated while searching until a final rule is obtained. The final rule matches resources that the crawler should process and does not match resources that it should avoid.
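The select, infer, and render loop in the abstract can be illustrated with a minimal sketch. The abstract does not say which form the inferred rule takes, so this example assumes the simplest possible one: a literal address prefix shared by all positive examples. The function names `infer_prefix_rule` and `matches` and the example.com addresses are hypothetical and do not come from the patent.

```python
import os.path


def infer_prefix_rule(positives, negatives):
    """Infer a prefix rule from example resource addresses.

    Assumption (not from the patent): a rule is a literal string prefix.
    Returns the longest prefix shared by every positive example, or None
    when there are no positives or the prefix also matches a negative
    example, in which case the user needs to pick different examples.
    """
    if not positives:
        return None
    # commonprefix compares the strings character by character.
    prefix = os.path.commonprefix(positives)
    if any(url.startswith(prefix) for url in negatives):
        return None
    return prefix


def matches(rule, url):
    """Return True if the crawler should process this address."""
    return rule is not None and url.startswith(rule)


# Examples the end-user marked in the rendered result set (hypothetical).
positives = [
    "http://example.com/docs/a.html",
    "http://example.com/docs/b.html",
]
negatives = ["http://example.com/ads/x.html"]

rule = infer_prefix_rule(positives, negatives)
print(rule)
print(matches(rule, "http://example.com/docs/c.html"))
print(matches(rule, "http://example.com/ads/y.html"))
```

In an interactive setting, the displayed rule and the subset of addresses it matches would be shown to the user, who then adds or removes examples and repeats the inference until the rule is acceptable. A real rule language would likely need wildcards or alternatives rather than a single prefix.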
Publication number: US6411952(B1) Publication date: 2002.06.25
Application number: US19980103904 Filing date: 1998.06.24
Applicant: COMPAQ INFORMATION TECHNOLOGIES GROUP, LP Inventors: BHARAT KRISHNA ASUR; MILLER ROBERT CHISOLM
Classification: G06F17/30; (IPC1-7): G06F17/30 Main classification: G06F17/30
Patent agency: Agent:
Main claim:
Address: