Title Identifying related documents based on links in documents
Abstract A device may identify, in a first document, a reference to a second document, the second document being different than the first document; identify that the reference to the second document is associated with a relation indicator; determine, based on identifying that the reference to the second document includes a relation indicator, that content of the second document is related to content of the first document; and process the second document based on determining that content of the second document is related to content of the first document.
Publication number US8892596(B1) Publication date 2014.11.18
Application number US201213569948 Filing date 2012.08.08
Applicant Google Inc. Inventors Semturs Christopher; Prahladka Piyush
Classification G06F17/30 Main classification G06F17/30
Agency Harrity & Harrity, LLP Agent Harrity & Harrity, LLP
Principal claim 1. A method comprising: identifying, in a first document and by one or more processors of one or more server devices, a reference to a second document, the second document being different than the first document; identifying, by one or more processors of the one or more server devices, that the reference to the second document is associated with a relation indicator, the relation indicator being associated with a link that references the second document; determining, based on identifying that the reference to the second document is associated with the relation indicator and by one or more processors of the one or more server devices, that content of the second document is related to content of the first document, the determining that the content of the second document is related to the content of the first document comprising: translating the first document to obtain a translated first document, the translated first document being in a language that matches a language of the second document; comparing the translated first document to the second document to obtain a measure of similarity between the translated first document and the second document; and determining, based on the comparing, that the content of the second document is related to the content of the first document when the measure of similarity satisfies a particular similarity threshold; and processing, by one or more processors of the one or more server devices, the second document based on determining that the content of the second document is related to the content of the first document.
Address Mountain View CA US
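
Illustrative sketch (not part of the patent record). The steps recited in claim 1 (find a reference that carries a relation indicator, translate the first document into the language of the second, compare the two, apply a similarity threshold) might be sketched as follows. Several things here are assumptions, not statements of the claimed implementation: the relation indicator is modeled as an HTML `rel` attribute, the similarity measure is a simple token-set Jaccard overlap, the threshold of 0.5 is arbitrary, and `fetch_text` and `translate` are hypothetical callables supplied by the caller.

```python
from html.parser import HTMLParser


class RelatedLinkParser(HTMLParser):
    """Collects hrefs of links that carry a relation indicator.

    Assumption: the relation indicator is modeled as an HTML `rel`
    attribute; the claim does not fix a concrete encoding.
    """

    def __init__(self, indicators=("alternate",)):
        super().__init__()
        self.indicators = set(indicators)
        self.references = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").split()
        if attr.get("href") and self.indicators.intersection(rel_values):
            self.references.append(attr["href"])


def jaccard_similarity(text_a, text_b):
    """Token-set overlap in [0, 1]; a stand-in for any similarity measure."""
    tokens_a = set(text_a.lower().split())
    tokens_b = set(text_b.lower().split())
    if not tokens_a and not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def find_related(first_html, first_text, fetch_text, translate, threshold=0.5):
    """Return references whose content satisfies the similarity threshold.

    fetch_text(href) -> (text, language) and translate(text, language) -> text
    are hypothetical callables; a real system would plug in its own.
    """
    parser = RelatedLinkParser()
    parser.feed(first_html)
    related = []
    for href in parser.references:
        second_text, second_lang = fetch_text(href)
        # Translate the FIRST document into the second document's language.
        translated_first = translate(first_text, second_lang)
        if jaccard_similarity(translated_first, second_text) >= threshold:
            related.append(href)  # "processing" is reduced to collecting here
    return related


# Toy data for demonstration only.
TOY_DICTIONARY = {"hello": "hallo", "world": "welt"}


def toy_translate(text, language):
    """Word-by-word lookup; ignores `language` because the toy data is German only."""
    return " ".join(TOY_DICTIONARY.get(w, w) for w in text.lower().split())


PAGES = {"/de": ("hallo welt", "de"), "/other": ("ganz anderer inhalt", "de")}
FIRST_HTML = (
    '<a rel="alternate" href="/de">DE</a>'
    '<a rel="alternate" href="/other">X</a>'
    '<a href="/plain">P</a>'
)

if __name__ == "__main__":
    print(find_related(FIRST_HTML, "hello world", PAGES.__getitem__, toy_translate))
```

In this toy run only `/de` is kept: `/plain` has no relation indicator and is never fetched, and `/other` carries the indicator but shares no tokens with the translated first document, so it falls below the threshold. This mirrors the claim's two-stage structure, in which the indicator alone nominates a candidate and the translated comparison confirms it.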