Title of invention System and method for characterizing a web page using multiple anchor sets of web pages
Abstract An improved system and method are provided for characterizing a web page using multiple anchor sets of web pages. Web pages in a collection of unknown web pages may be characterized using known anchor sets of web pages that have different characterizations and that may be linked to the collection of unknown web pages. A direction and a method may be selected for propagating a probability distribution between the vertices of a graph representing the collection of web pages and the vertices representing the anchor sets of web pages. Methods are provided for propagating the probability distribution in a forward, backward, or bidirectional direction. Various quality measures of the characterization of the vertices are provided using the propagated probability distribution. These quality measures may be paired and combined in different ways to characterize the vertices representing the unknown web pages.
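The abstract does not say how the probability distribution is propagated, so the following is only an illustrative sketch of the forward direction, not the method claimed in the application. It assumes a random walk with restart from each anchor set along out-links. The function names `propagate` and `characterize`, the `restart` and `steps` parameters, and the rule "pick the anchor set with the largest propagated mass" are all hypothetical choices made for this example.

```python
def propagate(edges, anchor_sets, steps=20, restart=0.15):
    """Forward-propagate one probability distribution per anchor set.

    Assumption: a random walk with restart seeded uniformly on each
    anchor set. The source abstract does not specify this scheme.
    """
    out_links = {}
    nodes = set()
    for src, dst in edges:
        out_links.setdefault(src, []).append(dst)
        nodes.update((src, dst))
    for anchors in anchor_sets.values():
        nodes.update(anchors)

    scores = {}
    for label, anchors in anchor_sets.items():
        # Seed distribution: uniform over the anchor pages, zero elsewhere.
        seed = {n: (1.0 / len(anchors) if n in anchors else 0.0) for n in nodes}
        dist = dict(seed)
        for _ in range(steps):
            nxt = {n: restart * seed[n] for n in nodes}
            for n in nodes:
                mass = (1.0 - restart) * dist[n]
                targets = out_links.get(n, [])
                if targets:
                    # Split the remaining mass evenly over out-links.
                    share = mass / len(targets)
                    for t in targets:
                        nxt[t] += share
                else:
                    # Pages without out-links return their mass to the seed.
                    for a in anchors:
                        nxt[a] += mass / len(anchors)
            dist = nxt
        scores[label] = dist
    return scores


def characterize(scores, page):
    """Hypothetical rule: choose the anchor set with the largest mass at page."""
    return max(scores, key=lambda label: scores[label][page])


# Toy graph: two "good" anchors link to u1, one "spam" anchor links to u2.
edges = [("good1", "u1"), ("good2", "u1"), ("spam1", "u2"), ("u1", "u3")]
anchor_sets = {"good": {"good1", "good2"}, "spam": {"spam1"}}
scores = propagate(edges, anchor_sets)
print(characterize(scores, "u1"), characterize(scores, "u2"))
```

A backward variant under the same assumptions would reverse each edge before calling `propagate`, and a bidirectional variant would combine the two results. How the application pairs and combines its quality measures is not described in this record.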
Publication number US2008082481(A1) Publication date 2008.04.03
Application number US20060542079 Filing date 2006.10.03
Applicant YAHOO! INC. Inventors JOSHI AMRUTA SADANAND; RAVIKUMAR SHANMUGASUNDARAM; REED BENJAMIN CLAY; TOMKINS ANDREW
Classification G06F17/30 Main classification G06F17/30
Agency Agent
Main claim
Address