Title of invention: CONSTRUCTING A TRANSLATION LEXICON FROM COMPARABLE, NON-PARALLEL CORPORA
Abstract: A machine translation system may use non-parallel monolingual corpora to generate a translation lexicon. The system may identify identically spelled words in the two corpora and use them as a seed lexicon. Using the seed lexicon as a basis, the system may then use various clues, e.g., context and frequency, to identify and score other possible translation pairs. An alternative system may use a small bilingual lexicon in addition to non-parallel corpora to learn translations of unknown words and to generate a parallel corpus.
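The seed-lexicon step described in the abstract can be illustrated with a minimal sketch. It is not taken from the application itself: the function name `seed_lexicon`, the lowercasing, and the `min_length` filter are assumptions introduced only to make the idea concrete (collect words spelled identically in both monolingual corpora and treat each as its own translation).

```python
from collections import Counter


def seed_lexicon(source_tokens, target_tokens, min_length=4):
    """Return a seed lexicon of words spelled identically in both corpora.

    Hypothetical helper: min_length and the case folding are illustrative
    assumptions, not details stated in the abstract.
    """
    # Count word forms in each monolingual corpus (case-insensitive).
    source_counts = Counter(token.lower() for token in source_tokens)
    target_counts = Counter(token.lower() for token in target_tokens)

    # Keep alphabetic words that occur in both corpora; very short words
    # are skipped because identical short strings are often coincidental.
    shared = (
        word
        for word in source_counts
        if word in target_counts and len(word) >= min_length and word.isalpha()
    )

    # Each identically spelled word is paired with itself.
    return {word: word for word in sorted(shared)}
```

A lexicon built this way would only be a starting point; the abstract's further clues (context and frequency) would be needed to score additional candidate pairs.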
Publication number: WO2004001623(A2) Publication date: 2003.12.31
Application number: WO2003US09573 Filing date: 2003.03.26
Applicant: UNIVERSITY OF SOUTHERN CALIFORNIA;MARCU, DANIEL;KNIGHT, KEVIN;MUNTEANU, DRAGOS, STEFAN;KOEHN, PHILIPP Inventor: MARCU, DANIEL;KNIGHT, KEVIN;MUNTEANU, DRAGOS, STEFAN;KOEHN, PHILIPP
Classification: G06F17/27 Main classification: G06F17/27
Agency: Agent:
Principal claim:
Address: