Title: Method for estimating coverage of Web search engines
Abstract: A computerized method is used to estimate the relative coverage of Web search engines. Each search engine maintains an index of the words of pages located at specific URL addresses in a network. The method generates a random query, which is a logical combination of words found in a subset of the pages. The random query is submitted to a first search engine. In response, a set of URLs of pages matching the query is received. Each URL identifies a page indexed by the first search engine that satisfies the random query. A particular URL identifying a sample page is randomly selected. A strong query corresponding to the sample page is generated, and the strong query is submitted to a second search engine. Result information received in response to the strong query is checked to determine whether the second search engine has indexed the sample page, or a page substantially similar to the sample page. This procedure is repeated to gather statistical data that is used to estimate the relative sizes and amount of overlap of search engines.
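The procedure in the abstract can be sketched on a toy in-memory model. This is only an illustration under stated assumptions, not the patented implementation: the engines are plain dictionaries mapping a URL to a set of words, the random query is a two-word OR, the strong query is an AND of the page's two rarest words, and a match requires the identical URL (the "substantially similar" test is not modeled). All function names, the toy pages, and the ratio used for relative size are hypothetical.

```python
import random
from collections import Counter

def search(index, words, mode):
    """Toy engine: URLs whose page has all ('and') or any ('or') of the words."""
    test = all if mode == "and" else any
    return sorted(u for u, page in index.items() if test(w in page for w in words))

def random_query(lexicon, rng, n_words=2):
    """Pick a few words at random from the lexicon (assumed query shape)."""
    return rng.sample(lexicon, n_words)

def strong_query(page_words, doc_freq, k=2):
    """Assumed strong query: the k rarest words of the sample page."""
    return sorted(page_words, key=lambda w: (doc_freq[w], w))[:k]

def containment(first, second, lexicon, doc_freq, rng, trials=200):
    """Fraction of pages sampled from `first` that `second` also returns."""
    hits = done = 0
    for _ in range(trials * 10):          # bounded retries for empty results
        if done == trials:
            break
        urls = search(first, random_query(lexicon, rng), "or")
        if not urls:
            continue                      # query matched nothing; try another
        sample = rng.choice(urls)         # randomly selected sample page
        results = search(second, strong_query(first[sample], doc_freq), "and")
        hits += sample in results
        done += 1
    return hits / done if done else None

# Hypothetical toy web: page u_i holds words w_i, w_{i+1} and "common".
pages = {f"u{i}": {f"w{i}", f"w{i + 1}", "common"} for i in range(20)}
engine_a = dict(pages)                                # indexes all 20 pages
engine_b = {f"u{i}": pages[f"u{i}"] for i in range(10)}  # indexes 10 of them
doc_freq = Counter(w for page in pages.values() for w in page)
lexicon = sorted(doc_freq)

rng = random.Random(0)
a_in_b = containment(engine_a, engine_b, lexicon, doc_freq, rng)
b_in_a = containment(engine_b, engine_a, lexicon, doc_freq, rng)
# Both fractions estimate the overlap divided by one engine's size, so
# their ratio is one possible estimate of size(A) / size(B).
size_ratio = b_in_a / a_in_b if a_in_b else None
```

Sampling through random queries is not uniform over an engine's pages, so the fractions in this sketch are rough estimates even on the toy data.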
Publication number: US2005055342(A1)  Publication date: 2005.03.10
Application number: US20040761800  Filing date: 2004.01.21
Applicant: BHARAT KRISHNA ASUR;BRODER ANDREI ZARY  Inventor: BHARAT KRISHNA ASUR;BRODER ANDREI ZARY
Classification: G06F17/30;(IPC1-7):G06F17/30  Main classification: G06F17/30
Agency:  Agent:
Main claim:
Address: