Title of invention SAMPLING AND OPTIMIZATION IN PHRASE-BASED MACHINE TRANSLATION USING AN ENRICHED LANGUAGE MODEL REPRESENTATION
Abstract Rejection sampling is performed to acquire at least one target language translation for a source language string s in accordance with a phrase-based statistical translation model p(x)=p(t,a|s), where t is a candidate translation, a is a candidate alignment comprising a biphrase sequence generating the candidate translation t, and x is a sequence representing the candidate alignment a. The rejection sampling uses a proposal distribution comprising a weighted finite state automaton (WFSA) q^(n) that is refined responsive to rejection of a sample x* obtained in a current iteration of the rejection sampling to generate a refined WFSA q^(n+1) for use in a next iteration of the rejection sampling. The refined WFSA q^(n+1) is selected to satisfy the criteria p(x) ≤ q^(n+1)(x) ≤ q^(n)(x) for all x ∈ X and q^(n+1)(x*) < q^(n)(x*), where the space X is the set of sequences x corresponding to candidate alignments a that generate candidate translations t for the source language string s.
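Application publication number US2014214397 (A1) Application publication date 2014.07.31
Application number US201313750338 Filing date 2013.01.25
Applicant XEROX CORPORATION Inventors Dymetman Marc; Aziz Wilker Ferreira; Venkatapathy Sriram
Classification G06F17/28 Main classification G06F17/28
Agency Agent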
Main claim 1. A non-transitory storage medium storing instructions executable by an electronic data processing device to perform rejection sampling to acquire at least one accepted target language translation for a source language string s in accordance with a phrase-based statistical translation model p(x)=p(t,a|s) where t is a candidate translation, a is a candidate alignment comprising a source language-target language biphrase sequence generating the candidate translation t, and x is a sequence representing the candidate alignment a, the rejection sampling using a proposal distribution comprising a weighted finite state automaton (WFSA) q^(n) that is refined responsive to rejection of a sample x* obtained in a current iteration of the rejection sampling to generate a refined WFSA q^(n+1) for use in a next iteration of the rejection sampling, wherein the refined WFSA q^(n+1) is selected to satisfy the criteria: p(x) ≤ q^(n+1)(x) ≤ q^(n)(x) for all x ∈ X; and q^(n+1)(x*) < q^(n)(x*); where the space X is the set of sequences x corresponding to candidate alignments a that generate candidate translations t for the source language string s.
Address Norwalk, CT, US
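Illustrative sketch (not part of the patent record): the refinement criteria stated in the main claim can be illustrated on a toy problem. The Python below is a minimal sketch under strong simplifying assumptions: the space X is a small finite set, the proposal is stored as a plain weight table rather than a WFSA, and the refinement simply lowers the proposal weight at the rejected sample x* to p(x*). The function name `rejection_sample`, the labels `x1`..`x3`, and all weights are hypothetical and are not taken from the patent.

```python
import random


def rejection_sample(p, q, rng, max_iters=10_000):
    """Draw one sample distributed proportionally to p, refining q on rejection.

    p: dict mapping each x to its unnormalized target weight p(x).
    q: dict mapping each x to a proposal weight with q[x] >= p[x];
       it is mutated in place whenever a sample is rejected.
    """
    xs = list(q)
    for _ in range(max_iters):
        # Sample x* from the current (unnormalized) proposal.
        x_star = rng.choices(xs, weights=[q[x] for x in xs], k=1)[0]
        ratio = p[x_star] / q[x_star]
        if rng.random() < ratio:
            return x_star  # accepted
        # Rejected, so p(x*) < q(x*) strictly. Lowering q at x* to p(x*)
        # keeps p <= q_new <= q_old everywhere and q_new(x*) < q_old(x*).
        q[x_star] = p[x_star]
    raise RuntimeError("no sample accepted within max_iters")


# Hypothetical toy target over three candidate sequences.
p = {"x1": 0.6, "x2": 0.3, "x3": 0.1}
q = {x: 1.0 for x in p}   # initial proposal: a loose upper bound on p
q0 = dict(q)              # copy of the initial proposal, kept for comparison
rng = random.Random(0)
samples = [rejection_sample(p, q, rng) for _ in range(5)]
```

In this toy setting each rejection makes the proposal exact at one point, so the acceptance rate can only improve over time. A WFSA-based proposal, as described in the claim, would instead refine a compact automaton over the whole space rather than a per-sequence table.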