Title System and method for automatically identifying classified websites
Abstract Systems, methods, and computer-readable storage media are provided for automatically identifying a classified website. A website is determined to be a candidate site based on a set of heuristics. From among the pages constituting the candidate site, one or more pages are determined to be listing page candidates and one or more pages are determined to be detail page candidates. A listing page score is then determined using a listing page classifier, and a detail page score is determined using a detail page classifier. The listing page and detail page scores each indicate the likelihood that the pages are part of a classified website. A candidate site score is determined based in part on a combination of the listing page scores and the detail page scores. When the candidate site score is above a threshold, the candidate site is determined to be a classified website.
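The scoring step described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the patented implementation: the abstract does not say how the page scores are combined, so the weighted average, the `listing_weight` parameter, the default threshold of 0.5, and all function names here are assumptions. The heuristics that select the candidate site and its candidate pages are not specified either, so the sketch takes the candidate pages as given.

```python
from statistics import mean
from typing import Callable, Sequence

# A page classifier maps page content to a likelihood in [0, 1] that the
# page belongs to a classified website (hypothetical interface).
PageClassifier = Callable[[str], float]


def candidate_site_score(
    listing_pages: Sequence[str],
    detail_pages: Sequence[str],
    listing_clf: PageClassifier,
    detail_clf: PageClassifier,
    listing_weight: float = 0.5,  # assumed; the abstract gives no weights
) -> float:
    """Combine listing and detail page scores into one site score."""
    # Average the per-page scores for each page type; an empty set scores 0.
    listing_score = mean([listing_clf(p) for p in listing_pages]) if listing_pages else 0.0
    detail_score = mean([detail_clf(p) for p in detail_pages]) if detail_pages else 0.0
    # Assumed combination: a weighted average of the two scores.
    return listing_weight * listing_score + (1.0 - listing_weight) * detail_score


def is_classified_website(site_score: float, threshold: float = 0.5) -> bool:
    """The site is accepted when its score is above the threshold."""
    return site_score > threshold


if __name__ == "__main__":
    # Toy stand-ins for trained classifiers (assumptions for illustration).
    def toy_listing_clf(page: str) -> float:
        return 0.9 if "results" in page else 0.1

    def toy_detail_clf(page: str) -> float:
        return 0.8 if "price" in page else 0.2

    score = candidate_site_score(
        ["search results page"], ["item with price"], toy_listing_clf, toy_detail_clf
    )
    print(score, is_classified_website(score))
```

In practice the two classifiers would be trained models rather than keyword checks, and the combination rule and threshold would be tuned on labeled sites.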
Publication No. US8380693(B1) Publication date 2013.02.19
Application No. US201113228337 Filing date 2011.09.08
Applicant GOOGLE INC.;XU CHENG;FENG GANG;LI XIN Inventor XU CHENG;FENG GANG;LI XIN
Classification G06F17/30 Main classification G06F17/30
Agency Agent
Main claim
Address