Title: Content signature notification
Abstract: A client application installed on end user computers generates metadata from the content of web pages visited by end users and provides the metadata to a search engine. When an end user visits a web page, the end user's computer downloads and displays the web page to the end user. The client application may simultaneously access the web page content and generate this metadata in the form of a content signature of the web page from the web page content. The client application then provides the content signature to a search engine. The search engine may employ content signatures to identify new web pages to crawl and index. Additionally, the search engine may employ content signatures to identify changes to web pages and determine the crawl frequency of web pages.
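The client-side step described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the patented method: the record does not say how a content signature is computed, so the use of SHA-256 digests, the function name `content_signatures`, and the portion names are all assumptions made for the example.

```python
import hashlib
from typing import Dict


def content_signatures(portions: Dict[str, bytes]) -> Dict[str, str]:
    """Return one hex digest per named portion of a downloaded page.

    Assumption: a signature is a SHA-256 digest of the raw bytes of a
    portion. The record only says a signature is generated from content.
    """
    return {
        name: hashlib.sha256(data).hexdigest()
        for name, data in portions.items()
    }


# Hypothetical page content, already split into portions by the client.
page = {
    "text": b"Breaking news: example headline",
    "images": b"\x89PNG placeholder image bytes",
}

signatures = content_signatures(page)
```

A client application would then send `signatures`, together with the page URL, to the search engine instead of sending the page content itself.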
Publication number: US9043306(B2)  Publication date: 2015.05.26
Application number: US201012861788  Filing date: 2010.08.23
Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC  Inventors: Canel Fabrice;Ahmed Junaid;McElroy Thomas Francis;Sun Walter;Chellapilla Kumar;Singh Abhishek;Challam Vishnu
Classification: G06F17/30  Main classification: G06F17/30
Agency:  Agents: Ream Dave;Barker Doug;Minhas Micky
Main claim: 1. A search engine server comprising: one or more processors; and one or more computer storage media storing computing-useable instructions that, when used by the one or more processors, cause the one or more processors to: receive, at the search engine server from an end user computing device, a plurality of content signatures of a web page, the plurality of content signatures having been automatically generated at the end user computing device from content of the web page downloaded and displayed by the end user computing device, the web page having been downloaded by the end user computing device from one or more content servers separate from the search engine server when an end user employed the end user computing device to access the web page during a web browsing session, each content signature corresponding with a different portion of the content of the web page, each portion comprising one of text, images, video, and audio; analyze, at the search engine server, the plurality of content signatures to identify a portion of the content that has changed on the web page by determining a difference between at least one of the plurality of content signatures and a content signature accessible by the search engine server; and control crawling of the web page by the search engine server based on the portion of the content that has changed on the web page. Receive, at the search engine server from an end user's computing device, a content signature of a web page, the content signature comprising a representation of the web page automatically generated from the content of the web page by a client application on the end user's computing device, the web page having been downloaded by the end user's computing device from a content server separate from the search engine server when an end user employed the end user's computing device to access the web page during a web browsing session; compare, at the search engine server, the content signature of the web page received from the end user's computing device to a second content signature of the web page accessible to the search engine server; determine, at the search engine server, a location of content within the web page that has changed based on a difference determined between the content signature of the web page received from the end user's computing device and the second content signature of the web page accessible to the search engine server; and crawl the web page based on determining the location of content within the web page that has changed.
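The server-side comparison recited in the claim might look something like the sketch below. It assumes each signature is an opaque string keyed by portion name and that the server keeps the last known signatures per URL. The names `changed_portions` and `crawl_action`, the action labels, and the example URL are hypothetical and do not come from the patent.

```python
from typing import Dict, List, Optional, Tuple

Signatures = Dict[str, str]  # portion name -> signature string (assumed shape)


def changed_portions(received: Signatures, stored: Signatures) -> List[str]:
    """List portions whose signature differs, was added, or was removed."""
    names = set(received) | set(stored)
    return sorted(n for n in names if received.get(n) != stored.get(n))


def crawl_action(
    received: Signatures, stored: Optional[Signatures]
) -> Tuple[str, List[str]]:
    """Pick a crawl action from the difference between signatures."""
    if stored is None:
        # No stored signature: treat the page as new to the index.
        return "crawl_new_page", sorted(received)
    changed = changed_portions(received, stored)
    if changed:
        return "recrawl", changed
    return "skip", []


# Hypothetical server-side store of previously seen signatures.
index = {"https://example.com/a": {"text": "aaa", "images": "bbb"}}
received = {"text": "aaa", "images": "ccc"}

action, portions = crawl_action(received, index.get("https://example.com/a"))
new_action, new_portions = crawl_action(received, index.get("https://example.com/b"))
```

Here `portions` names the part of the page whose signature changed, which a crawler could use to decide whether and how soon to fetch the page again.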
Address: Redmond WA US