Title of invention |
METHODS AND APPARATUSES FOR PROCESSING A BILINGUAL DATABASE |
Abstract |
Aligned corpora (206, CORPE, CORPF) are generated or received from an external source. Each corpus comprises a set of portions aligned with corresponding portions of the other corpus, for example so that aligned portions are nominally translations of one another in two natural languages. A statistical database (210) is compiled. An evaluation module (212) calculates correlation scores for pairs of words chosen one from each corpus. Given a pair of text portions (one in each language), the evaluation module (212) combines word pair correlation scores to obtain an alignment score for the text portions. These alignment scores can be used to verify a translation (230) and/or to modify the aligned corpora (206) to remove improbable alignments.
|
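The scoring idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which correlation measure is used or how word pair scores are combined, so everything specific here is an assumption made for illustration: the Dice coefficient as the correlation score, a best-match average as the combination rule, whitespace tokenisation, and the function names build_statistics, correlation and alignment_score.

```python
from collections import Counter
from itertools import product


def tokenize(portion):
    """Assumed tokenisation: lowercase, split on whitespace, ignore repeats."""
    return set(portion.lower().split())


def build_statistics(aligned_pairs):
    """Count in how many aligned portions each word and each cross-language
    word pair occurs (a stand-in for the statistical database)."""
    count_e, count_f, count_ef = Counter(), Counter(), Counter()
    for portion_e, portion_f in aligned_pairs:
        words_e, words_f = tokenize(portion_e), tokenize(portion_f)
        count_e.update(words_e)
        count_f.update(words_f)
        count_ef.update(product(words_e, words_f))
    return count_e, count_f, count_ef


def correlation(word_e, word_f, stats):
    """Assumed measure: Dice coefficient, 2 * joint / (count_e + count_f)."""
    count_e, count_f, count_ef = stats
    denom = count_e[word_e] + count_f[word_f]
    return 2.0 * count_ef[(word_e, word_f)] / denom if denom else 0.0


def alignment_score(portion_e, portion_f, stats):
    """Assumed combination: for each word of the first portion take its best
    correlation with any word of the second portion, then average."""
    words_e, words_f = tokenize(portion_e), tokenize(portion_f)
    if not words_e or not words_f:
        return 0.0
    best = [max(correlation(e, f, stats) for f in words_f) for e in words_e]
    return sum(best) / len(best)
```

Under these assumptions, a candidate alignment whose score falls below some chosen threshold could be flagged as improbable and removed from the aligned corpora, or a proposed translation could be checked by scoring it against its source portion.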
Publication number |
WO9500912(A1) |
Publication date |
1995.01.05 |
Application number |
WO1994GB01321 |
Filing date |
1994.06.17 |
Applicant |
CANON RESEARCH CENTRE EUROPE LTD.;CANON EUROPA N.V.;O'DONOGHUE, TIMOTHY, FRANCIS |
Inventor |
O'DONOGHUE, TIMOTHY, FRANCIS |
Classification |
G06F17/28;G06F17/30;(IPC1-7):G06F15/38 |
Main classification |
G06F17/28 |
Agency |
|
Agent |
|
Principal claim |
|
Address |
|