Title of Invention |
Building A Translation Lexicon From Comparable, Non-Parallel Corpora |
Abstract |
A machine translation system may use non-parallel monolingual corpora to generate a translation lexicon. The system may identify identically spelled words in the two corpora and use them as a seed lexicon. The system may use various clues, e.g., context and frequency, to identify and score other possible translation pairs, using the seed lexicon as a basis. An alternative system may use a small bilingual lexicon in addition to non-parallel corpora to learn translations of unknown words and to generate a parallel corpus.
|
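The seed-lexicon idea in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the function names, the minimum word length of 6, the context window of 2, and the use of cosine similarity over seed-word context counts are all assumptions made for illustration, and the two sentences are toy data.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep runs of letters only (Unicode-aware).
    return re.findall(r"[^\W\d_]+", text.lower())


def seed_lexicon(tokens_a, tokens_b, min_len=6):
    # Identically spelled words found in both corpora become seed pairs.
    # min_len is an assumed filter to avoid short accidental matches.
    shared = set(tokens_a) & set(tokens_b)
    return {w: w for w in shared if len(w) >= min_len}


def context_vector(tokens, word, seed_map, window=2):
    # Count seed words near each occurrence of `word`.
    # seed_map maps a context word to its shared dimension label.
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != word:
            continue
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i and tokens[j] in seed_map:
                counts[seed_map[tokens[j]]] += 1
    return counts


def cosine(u, v):
    # Cosine similarity of two sparse count vectors (assumed scoring clue).
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
        sum(x * x for x in v.values())
    )
    return dot / norm if norm else 0.0


def score_pair(word_a, word_b, tokens_a, tokens_b, seeds):
    # Compare contexts of a candidate pair in the seed-word space.
    map_a = seeds                              # language A word -> dimension
    map_b = {b: b for b in seeds.values()}     # language B word -> dimension
    vec_a = context_vector(tokens_a, word_a, map_a)
    vec_b = context_vector(tokens_b, word_b, map_b)
    return cosine(vec_a, vec_b)


# Toy corpora (hypothetical, for illustration only).
tokens_en = tokenize(
    "the international president met the national minister in the capital"
)
tokens_de = tokenize(
    "der international präsident traf den national minister in der hauptstadt"
)
seeds = seed_lexicon(tokens_en, tokens_de)
```

Here `score_pair("president", "präsident", tokens_en, tokens_de, seeds)` compares the two words by the seed words that occur near them. A frequency clue, which the abstract also mentions, could be combined with this score but is left out of the sketch.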
Publication Number |
US2010042398(A1) |
Publication Date |
2010.02.18 |
Application Number |
US20090576110 |
Filing Date |
2009.10.08 |
Applicant |
MARCU DANIEL;KNIGHT KEVIN;MUNTEANU DRAGOS STEFAN;KOHEN PHILIPP |
Inventor |
MARCU DANIEL;KNIGHT KEVIN;MUNTEANU DRAGOS STEFAN;KOHEN PHILIPP |
Classification Number |
G06F17/28;G06F17/21;G06F17/27 |
Main Classification Number |
G06F17/28 |
Agency |

Agent |

Principal Claim |

Address |
|