Title of Invention METHOD AND APPARATUS FOR FINDING MIRRORED HOSTS
Abstract A method and system that detects mirrored host pairs using information about a large set of pages, including one or more of: URLs, IP addresses, and connectivity information. The identities of the detected mirrored hosts are then saved so that browsers, crawlers, proxy servers, or the like can correctly identify mirrored web sites. The described embodiments of the present invention use one or a combination of techniques to identify mirrors. A first group of techniques involves determining mirrors based on URLs and information about connectivity (i.e., hyperlinks) between pages. A second group of techniques looks at connectivity information at a higher granularity, considering all links from all pages on a host as one group and ignoring the target of each link beyond the host level.
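The second group of techniques in the abstract (host-level connectivity) can be illustrated with a minimal sketch. It is not the method claimed in the patent: the Jaccard similarity measure, the 0.8 threshold, and the function names `host_of`, `host_outlinks`, and `candidate_mirrors` are all assumptions made for illustration. The sketch pools every link from every page on a host, keeps only the host part of each link target, and reports host pairs whose sets of linked-to hosts overlap strongly.

```python
from collections import defaultdict
from itertools import combinations
from urllib.parse import urlsplit


def host_of(url):
    """Return the lowercased host part of a URL ('' if there is none)."""
    return (urlsplit(url).hostname or "").lower()


def host_outlinks(links):
    """Map each source host to the set of other hosts its pages link to.

    `links` is an iterable of (source_url, target_url) pairs. Everything
    below the host level (paths, queries) is deliberately discarded.
    """
    out = defaultdict(set)
    for src, dst in links:
        s, d = host_of(src), host_of(dst)
        if s and d and s != d:
            out[s].add(d)
    return out


def candidate_mirrors(links, threshold=0.8):
    """Return (host_a, host_b, score) for host pairs with similar outlinks.

    Assumption: similarity is the Jaccard coefficient of the two hosts'
    outlink sets; the patent abstract does not specify a measure.
    """
    out = host_outlinks(links)
    pairs = []
    for a, b in combinations(sorted(out), 2):
        # Ignore links between the two candidates themselves.
        sa, sb = out[a] - {b}, out[b] - {a}
        union = sa | sb
        if not union:
            continue
        score = len(sa & sb) / len(union)
        if score >= threshold:
            pairs.append((a, b, score))
    return pairs
```

Called on a list of crawled (source, target) URL pairs, `candidate_mirrors` returns candidate pairs only. A real system would presumably confirm them with further evidence such as the URL or IP address comparisons the abstract mentions.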
Publication Number WO2000069142(A2) Publication Date 2000.11.16
Application Number US2000012236 Filing Date 2000.05.05
Applicant Inventor
Classification Number Main Classification Number
Agency Agent
Principal Claim
Address