Title of invention: SELECTION OF DOMAIN-ADAPTED TRANSLATION SUBCORPORA
Abstract: An architecture that selects the most relevant data from an out-domain corpus for use either in isolation or in combination with in-domain data. The architecture performs domain adaptation for machine translation by selecting the most relevant sentences from a larger general-domain corpus of parallel translated sentences. The methods for selecting the data include a monolingual cross-entropy measure, monolingual cross-entropy difference, bilingual cross-entropy, and bilingual cross-entropy difference. A translation model is trained on both the in-domain data and an out-domain subset, and the models can be interpolated together to boost performance on in-domain translation tasks.
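The abstract names the selection criteria but this record does not give their formulas. As a rough illustration only, the sketch below ranks out-domain sentence pairs by a bilingual cross-entropy difference, assuming the score is the sum over the source and target sides of (cross-entropy under an in-domain language model) minus (cross-entropy under an out-domain language model), with lower scores treated as more relevant. The add-one smoothed unigram models and the names `train_unigram`, `cross_entropy`, `bilingual_ce_difference`, and `select_subcorpus` are hypothetical stand-ins chosen for brevity, not the patent's implementation.

```python
import math
from collections import Counter


def train_unigram(sentences, vocab):
    """Add-one smoothed unigram model mapping token -> log probability."""
    counts = Counter(tok for s in sentences for tok in s.split())
    # One extra slot is reserved for tokens outside the vocabulary.
    total = sum(counts[t] for t in vocab) + len(vocab) + 1
    lm = {t: math.log((counts[t] + 1) / total) for t in vocab}
    lm["<unk>"] = math.log(1 / total)
    return lm


def cross_entropy(sentence, lm):
    """Per-token cross-entropy (in nats) of a non-empty sentence under lm."""
    toks = sentence.split()
    return -sum(lm.get(t, lm["<unk>"]) for t in toks) / len(toks)


def bilingual_ce_difference(in_pairs, out_pairs):
    """Score each out-domain (source, target) pair; lower means more in-domain-like.

    Assumed formula: sum over both sides of H_in(sentence) - H_out(sentence).
    """
    scores = [0.0] * len(out_pairs)
    for side in (0, 1):  # 0 = source side, 1 = target side
        in_sents = [p[side] for p in in_pairs]
        out_sents = [p[side] for p in out_pairs]
        vocab = {t for s in in_sents + out_sents for t in s.split()}
        lm_in = train_unigram(in_sents, vocab)
        lm_out = train_unigram(out_sents, vocab)
        for i, s in enumerate(out_sents):
            if not s.split():
                scores[i] = float("inf")  # never select empty sentences
                continue
            scores[i] += cross_entropy(s, lm_in) - cross_entropy(s, lm_out)
    return scores


def select_subcorpus(in_pairs, out_pairs, k):
    """Return the k out-domain pairs with the lowest score."""
    scores = bilingual_ce_difference(in_pairs, out_pairs)
    ranked = sorted(range(len(out_pairs)), key=lambda i: scores[i])
    return [out_pairs[i] for i in ranked[:k]]


# Toy data, invented for this illustration.
in_pairs = [
    ("the patient takes the drug", "le patient prend le médicament"),
    ("the drug helps the patient", "le médicament aide le patient"),
]
out_pairs = [
    ("the patient needs the drug", "le patient a besoin du médicament"),
    ("the market fell today", "le marché a chuté aujourd'hui"),
    ("the bank raised rates", "la banque a relevé les taux"),
]
print(select_subcorpus(in_pairs, out_pairs, 1))
```

Dropping the out-domain term gives a plain cross-entropy ranking, and restricting the loop to one side gives a monolingual variant. A practical system would presumably use higher-order n-gram models and a much larger corpus; the unigram models here only keep the example short.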
Publication number: US2012203539(A1)  Publication date: 2012.08.09
Application number: US201113022633  Filing date: 2011.02.08
Applicant: AXELROD AMITTAI; GAO JIANFENG; HE XIAODONG; MICROSOFT CORPORATION  Inventor: AXELROD AMITTAI; GAO JIANFENG; HE XIAODONG
Classification: G06F17/28  Main classification: G06F17/28
Agency:  Agent:
Main claim:
Address: