Title: Techniques for performing language detection and translation for multi-language content feeds
Abstract: A technique for translating a portion of a website includes receiving a language set of a user, the language set indicating a primary language of the user. A content feed to be displayed to the user is received and parsed to identify a text portion of user generated content. The original language of the text portion is determined and compared with the one or more languages in the language set. When the original language of the text portion does not match any of the languages in the language set: (i) the text portion, the original language, and the primary language are provided to a translation engine, (ii) a translated version of the text portion is received from the translation engine, (iii) the translated version of the text portion is inserted into the content feed to obtain a modified content feed, and (iv) the modified content feed is displayed to the user.
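The flow in the abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: `FeedItem`, `translate_feed`, and the `translate` callable are hypothetical names, each item's original language is assumed to be already determined, and the translation engine is assumed to be any function taking the text, the original language, and the primary language.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Optional, Set


@dataclass(frozen=True)
class FeedItem:
    """Hypothetical stand-in for one user generated content element."""
    author: str
    text: str
    language: str                      # original language, assumed already determined
    translation: Optional[str] = None  # filled in only when a translation is inserted


def translate_feed(
    items: List[FeedItem],
    language_set: Set[str],
    primary_language: str,
    translate: Callable[[str, str, str], str],
) -> List[FeedItem]:
    """Return a modified feed with translations inserted where needed."""
    modified = []
    for item in items:
        if item.language in language_set:
            # The viewing user reads this language: leave the item untouched.
            modified.append(item)
        else:
            # Steps (i)-(iii): send the text, original language, and primary
            # language to the engine, then insert the result into the feed.
            translated = translate(item.text, item.language, primary_language)
            modified.append(replace(item, translation=translated))
    return modified  # step (iv): the caller displays this modified feed
```

Keeping the original text next to the inserted translation is a design choice of this sketch; the abstract only says the translated version is inserted into the feed.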
Publication number: US8812295(B1) Publication date: 2014.08.19
Application number: US201113279568 Filing date: 2011.10.24
Applicant: Google Inc. Inventors: Swerdlow Andrew; Jagpal Navdeep Singh
Classification: G06F17/20 Main classification: G06F17/20
Agency: Remarck Law Group PLC Agent: Remarck Law Group PLC
Main claim: 1. A computer implemented method for translating a portion of a website comprising: determining whether the website comprises a parsable content feed including user generated content having at least one text portion; and when the website is determined to comprise the parsable content feed: receiving a language set of a viewing user, the language set including one or more languages and a primary language indicator that identifies one of the one or more languages as a primary language of the viewing user; receiving an aggressiveness setting of the viewing user; parsing a document object model representing the parsable content feed; determining a user generated content element in the document object model, the user generated content element representing the user generated content; and determining a text portion element of the user generated content element, the text portion element containing the at least one text portion; determining an original language of the at least one text portion by: (i) providing the at least one text portion to a language determination engine and receiving a potential language classification for the at least one text portion and a confidence score associated therewith, (ii) comparing the confidence score to a threshold, wherein the threshold is based on the received aggressiveness setting, (iii) when the confidence score is greater than the threshold, adopting the potential language classification of the at least one text portion as the original language, and (iv) when the confidence score is less than the threshold, adopting one of the one or more languages in the language set as the original language such that the original language of the at least one text portion matches one of the one or more languages in the language set; comparing the original language of the at least one text portion with the one or more languages in the language set; and when the original language of the at least one text portion does not match one of the one or more languages in the language set: (i) providing, to a translation engine, the at least one text portion, the original language of the at least one text portion and the primary language indicator, (ii) receiving, from the translation engine, a translated version of the at least one text portion, the translated version corresponding to a translation of the at least one text portion from the original language to the primary language of the viewing user, (iii) inserting the translated version of the at least one text portion into the parsable content feed to obtain a modified content feed, and (iv) providing the modified content feed for display to the viewing user.
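The language determination steps (i) through (iv) of the claim can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: the claim only says the threshold is "based on" the aggressiveness setting, so the linear mapping in `threshold_for` is hypothetical, as are all function names. The claim also does not say which language of the set is adopted on low confidence (the sketch picks the primary language) or what happens when the score equals the threshold (the sketch treats that as low confidence).

```python
from typing import Callable, Set, Tuple


def threshold_for(aggressiveness: float) -> float:
    """Hypothetical mapping: a more aggressive setting lowers the threshold,
    so more detections are trusted and more text ends up translated."""
    if not 0.0 <= aggressiveness <= 1.0:
        raise ValueError("aggressiveness must be in [0, 1]")
    return 1.0 - aggressiveness


def determine_original_language(
    text: str,
    detect: Callable[[str], Tuple[str, float]],
    primary_language: str,
    aggressiveness: float,
) -> str:
    """Steps (i)-(iv): trust the detector only above the threshold."""
    # (i) ask the language determination engine for a label and a confidence score
    candidate, confidence = detect(text)
    # (ii)-(iii) adopt the candidate when the score exceeds the threshold
    if confidence > threshold_for(aggressiveness):
        return candidate
    # (iv) otherwise adopt a language from the user's set (assumed here to be
    # the primary language), which guarantees a match and so no translation
    return primary_language


def needs_translation(original_language: str, language_set: Set[str]) -> bool:
    """Final comparison step: translate only when no language in the set matches."""
    return original_language not in language_set
```

With a detector that reports French at confidence 0.6, an aggressiveness of 0.5 gives a threshold of 0.5 and the text is treated as French, while an aggressiveness of 0.2 gives a threshold of 0.8 and the text falls back to the primary language, so it is left untranslated.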
Address: Mountain View, CA, US