Title of invention: Computerized statistical machine translation with phrasal decoder
Abstract: A computerized system for performing statistical machine translation with a phrasal decoder is provided. The system may include a phrasal decoder trained prior to run-time on a monolingual parallel corpus, the monolingual parallel corpus including a machine translation output of source language documents of a bilingual parallel corpus and a corresponding target human translation output of the source language documents, to thereby learn mappings between the machine translation output and the target human translation output. The system may further include a statistical machine translation engine configured to receive a translation input and to produce a raw machine translation output, at run-time. The phrasal decoder may be configured to process the raw machine translation output, and to produce a corrected translation output based on the learned mappings for display on a display associated with the system.
Publication number: US9176952(B2) Publication date: 2015.11.03
Application number: US200812238207 Filing date: 2008.09.25
Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC Inventors: Aikawa Takako;Ruopp Achim
Classification: G06F17/28;G06F17/27 Main classification: G06F17/28
Agency: Agents: Swain Sandy;Minhas Micky
Principal claim: 1. A computerized system for performing statistical machine translation, the system comprising: a statistical machine translation engine executed on a user computing device, the statistical machine translation engine trained on a bilingual parallel corpus including source language documents and a corresponding target human translation of the source language documents, and configured to receive a translation input and to produce a raw machine translation output, at run-time; a phrasal decoder, separate and distinct from the statistical machine translation engine, executed on the user computing device, the phrasal decoder being trained prior to run-time on a monolingual parallel corpus, the monolingual parallel corpus including a machine translation output of the source language documents of the bilingual parallel corpus and the corresponding target human translation output of the source language documents of the bilingual parallel corpus, to thereby learn mappings and build a phrase table by establishing phrase pairs between the machine translation output and the target human translation output, wherein the machine translation output is unedited by human translators, and wherein the phrasal decoder is trained prior to run-time on a developer computing device on which the bilingual parallel corpus is stored, assigning to each phrase pair a statistical score representing a utility of each phrase pair; and wherein at run-time on the user computing device the phrasal decoder is configured to process the raw machine translation output, and to produce a corrected translation output based on the learned mappings and the phrase table, programmatically correcting the raw machine translation output if a statistical score for correspondence of the phrase pair is above a predetermined threshold.
Address: Redmond WA US
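Illustrative sketch (not part of the patent record): the principal claim describes two stages, namely building a phrase table with per-pair scores from raw machine translation output paired with human translations, and then correcting new raw output at run-time only where a pair's score exceeds a predetermined threshold. The toy Python below shows one way this could look. Everything in it is an assumption made for illustration: the function names build_phrase_table and correct, the position-by-position alignment of equal-length sentences, the relative-frequency score, and the default threshold of 0.5 are hypothetical and are not taken from the patent, which does not specify its alignment or scoring method in this record.

```python
from collections import Counter, defaultdict


def build_phrase_table(mt_sentences, human_sentences, max_len=2):
    """Training stage (hypothetical): pair n-grams of raw MT output with
    the n-grams at the same positions in the human translation.

    Assumption: only sentence pairs with equal token counts are used, and
    tokens are aligned position by position. A real phrasal decoder would
    use a learned word alignment instead.
    """
    pair_counts = defaultdict(Counter)
    for mt, ref in zip(mt_sentences, human_sentences):
        mt_tok, ref_tok = mt.split(), ref.split()
        if len(mt_tok) != len(ref_tok):
            continue  # the toy alignment cannot handle these pairs
        for n in range(1, max_len + 1):
            for i in range(len(mt_tok) - n + 1):
                src = tuple(mt_tok[i:i + n])
                tgt = tuple(ref_tok[i:i + n])
                pair_counts[src][tgt] += 1

    # Score each phrase pair by relative frequency: count(src, tgt) / count(src).
    table = {}
    for src, targets in pair_counts.items():
        total = sum(targets.values())
        table[src] = {tgt: count / total for tgt, count in targets.items()}
    return table


def correct(raw_output, table, threshold=0.5, max_len=2):
    """Run-time stage (hypothetical): scan left to right, trying the longest
    phrase first, and replace a phrase only when its best-scoring target
    differs from it and the score is above the threshold."""
    tokens = raw_output.split()
    out = []
    i = 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            src = tuple(tokens[i:i + n])
            candidates = table.get(src)
            if candidates:
                tgt, score = max(candidates.items(), key=lambda kv: kv[1])
                if score > threshold and tgt != src:
                    out.extend(tgt)
                    i += n
                    break
        else:
            # No phrase starting here qualified; keep the raw token.
            out.append(tokens[i])
            i += 1
    return " ".join(out)
```

In this sketch the threshold acts as the gate named in the claim: raising it makes the post-editor more conservative, so raw output passes through unchanged unless the training data strongly supports a replacement.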