Abstract |
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for identifying a plurality of web page images in a target web page; identifying a plurality of hyperlinks in other web pages, wherein each of the hyperlinks (i) links to the target web page and (ii) includes a respective image tag for a respective thumbnail image; determining a visual similarity score for each of the plurality of web page images with reference to the thumbnail images; identifying a first web page image in the plurality of web page images that has a highest visual similarity score with reference to the thumbnail images that satisfies a minimum similarity threshold; and labeling the first web page image as a primary image for the target web page. |
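The selection step described in the abstract can be illustrated with a minimal sketch. It is not the claimed implementation: the abstract does not say how visual similarity is computed or how scores against several thumbnails are combined. The sketch assumes each image has already been reduced to a numeric feature vector, uses cosine similarity as a stand-in similarity measure, and takes the maximum score over all thumbnails. The names `cosine_similarity`, `select_primary_image`, and `min_similarity` are hypothetical.

```python
from math import sqrt
from typing import Mapping, Optional, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Stand-in visual similarity: cosine of the angle between feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def select_primary_image(
    page_images: Mapping[str, Vector],
    thumbnails: Sequence[Vector],
    min_similarity: float = 0.8,  # assumed threshold value, not from the source
) -> Optional[str]:
    """Return the URL of the page image that best matches the thumbnails.

    page_images maps each image URL on the target page to its feature vector.
    thumbnails holds feature vectors of thumbnail images taken from hyperlinks
    on other pages that link to the target page.
    Returns None when no image reaches the minimum similarity threshold.
    """
    best_url: Optional[str] = None
    best_score = float("-inf")
    for url, features in page_images.items():
        # Assumption: an image's score is its best match against any thumbnail.
        score = max(
            (cosine_similarity(features, t) for t in thumbnails),
            default=0.0,
        )
        if score > best_score:
            best_url, best_score = url, score
    if best_url is not None and best_score >= min_similarity:
        return best_url
    return None
```

For example, calling `select_primary_image({"a.png": [1, 0, 0], "b.png": [0, 1, 0]}, [[0.9, 0.1, 0]])` would pick `"a.png"`, since its vector is closest to the single thumbnail and clears the assumed threshold. A real system would replace the feature vectors and cosine measure with whatever image comparison it actually uses.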