Title of invention |
Apparatus and method for forming a filtered inflected language model for automatic speech recognition |
Abstract |
A method of forming a language model for a language having a selected vocabulary of word forms comprises: (a) mapping the word forms into integer vectors in accordance with frequencies of word form occurrence; (b) partitioning the integer vectors into subsets, the subsets respectively having ranges of frequencies of word form occurrence associated therewith, the subsets being arranged in a descending order of frequency ranges; (c) respectively assigning maps to the subsets; (d) filtering a textual corpus using the maps assigned to the subsets in order to generate indexed integers; (e) determining n-gram statistics for the indexed integers; and (f) estimating n-gram language model probabilities from the n-gram statistics to form the language model.
|
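Steps (a) through (f) of the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function name `build_filtered_bigram_model`, the `boundaries` parameter (rank cut points between subsets), the choice of maps (the most frequent subset keeps each word form's own integer, every other subset collapses to a single class integer), the use of plain integers rather than integer vectors, and the unsmoothed bigram estimate are all assumptions made for illustration.

```python
from collections import Counter


def build_filtered_bigram_model(sentences, boundaries=(2,)):
    """Toy version of steps (a)-(f); every design choice here is an assumption."""
    # (a) Map word forms to integers by frequency: the most frequent form gets 0.
    counts = Counter(w for s in sentences for w in s)
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    word_to_int = {w: i for i, w in enumerate(ranked)}
    vocab_size = len(ranked)

    # (b) Partition the integers into subsets in descending order of frequency.
    # `boundaries` holds hypothetical rank cut points between subsets.
    def subset_of(rank):
        for k, cut in enumerate(boundaries):
            if rank < cut:
                return k
        return len(boundaries)

    # (c) Assign one map per subset. Assumption: subset 0 keeps the word's own
    # integer; each lower-frequency subset collapses to one shared class integer.
    def make_map(k):
        if k == 0:
            return lambda rank: rank
        return lambda rank: vocab_size + k

    maps = [make_map(k) for k in range(len(boundaries) + 1)]

    # (d) Filter the corpus through the maps to obtain indexed integers.
    filtered = [
        [maps[subset_of(word_to_int[w])](word_to_int[w]) for w in s]
        for s in sentences
    ]

    # (e) Collect n-gram statistics (bigrams, for brevity).
    history = Counter()
    bigrams = Counter()
    for s in filtered:
        for a, b in zip(s, s[1:]):
            history[a] += 1
            bigrams[(a, b)] += 1

    # (f) Estimate probabilities by maximum likelihood (no smoothing).
    probs = {(a, b): c / history[a] for (a, b), c in bigrams.items()}
    return word_to_int, filtered, probs
```

A call such as `build_filtered_bigram_model([["a", "b", "a", "c"], ["a", "b"]])` returns the word-to-integer table, the filtered corpus, and a dictionary of bigram probabilities keyed by integer pairs. A practical system would add smoothing and would choose the maps to suit the morphology of the target language.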
Publication number |
US6073091(A) |
Publication date |
2000.06.06 |
Application number |
US19970906812 |
Filing date |
1997.08.06 |
Applicant |
INTERNATIONAL BUSINESS MACHINES CORPORATION |
Inventors |
KANEVSKY, DIMITRI;MONKOWSKI, MICHAEL DANIEL;SEDIVY, JAN |
Classification (IPC) |
G10L15/18;(IPC1-7):G06F17/28;G10L5/06;G10L9/00 |
Main classification |
G10L15/18 |
Patent agency |
|
Patent agent |
|
Principal claim |
|
Address |
|