Title of invention Token-Level Interpolation For Class-Based Language Models
Abstract Optimized language models are provided for in-domain applications through an iterative, joint-modeling approach that interpolates a language model (LM) from a number of component LMs according to interpolation weights optimized for a target domain. The component LMs may include class-based LMs, and the interpolation may be context-specific or context-independent. Through iterative processes, the component LMs may be interpolated and used to express training material as alternative representations or parses of tokens. Posterior probabilities may be determined for these parses and used for determining new (or updated) interpolation weights for the LM components, such that a combination or interpolation of component LMs is further optimized for the domain. The component LMs may be merged, according to the optimized weights, into a single, combined LM, for deployment in an application scenario.
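The iterative weight update described in the abstract can be illustrated with a deliberately simplified sketch. It assumes context-independent weights and a single fixed tokenization, so it reduces to the standard expectation-maximization (EM) update for linear interpolation weights rather than the parse-based, class-aware procedure of the patent. The function name `em_interpolation_weights` and the input layout `component_probs[t][m]` (the probability that component LM `m` assigns to token `t` in its context) are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def em_interpolation_weights(
    component_probs: Sequence[Sequence[float]],
    iterations: int = 50,
) -> List[float]:
    """Estimate linear interpolation weights with EM.

    component_probs[t][m] is the probability that component LM m assigns
    to training token t (hypothetical input layout, not from the patent).
    """
    num_components = len(component_probs[0])
    # Start from uniform weights.
    weights = [1.0 / num_components] * num_components

    for _ in range(iterations):
        # E-step: accumulate each component's posterior responsibility.
        counts = [0.0] * num_components
        for probs in component_probs:
            mixture = sum(w * p for w, p in zip(weights, probs))
            if mixture == 0.0:
                continue  # no component explains this token; skip it
            for m, (w, p) in enumerate(zip(weights, probs)):
                counts[m] += w * p / mixture

        total = sum(counts)
        if total == 0.0:
            break  # nothing to re-estimate from
        # M-step: new weights are the normalized responsibilities.
        weights = [c / total for c in counts]

    return weights


if __name__ == "__main__":
    # Toy data: component 0 fits the in-domain tokens better than component 1.
    toy = [[0.5, 0.1], [0.4, 0.2], [0.6, 0.1]]
    print(em_interpolation_weights(toy))
```

On this toy input the weight of the better-fitting component grows with each iteration. Handling alternative parses of class tokens, as the abstract describes, would require accumulating posteriors over parses instead of over single tokens.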
Publication number US2016267905(A1) Publication date 2016.09.15
Application number US201514644976 Filing date 2015.03.11
Applicant Microsoft Technology Licensing, LLC Inventors Levit Michael; Parthasarathy Sarangarajan; Stolcke Andreas; Chang Shuangyu
Classification G10L15/183 Main classification G10L15/183
Agency Agent
Main claim 1. An automatic speech recognition (ASR) system comprising: an acoustic sensor configured to convert speech into acoustic information; an acoustic model (AM) configured to convert the acoustic information into a first corpus of words; and a language model (LM) configured to convert the first corpus of words into plausible word sequences, the LM determined from an interpolation of a plurality of component LMs and corresponding set of coefficient weights, wherein at least one of the component LMs is class-based, and wherein the interpolation is context-specific.
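As a rough illustration of what "context-specific" interpolation in the claim could look like, the sketch below selects a weight vector by the token's history and falls back to default weights for unseen histories. Everything here is an assumption made for illustration: the name `interpolated_prob`, the representation of component LMs as callables, and the dictionary keyed by history tuples do not come from the patent, and a class-based component would additionally factor its probability through a class token.

```python
from typing import Callable, Dict, Sequence, Tuple

History = Tuple[str, ...]
# Hypothetical interface: a component LM maps (token, history) to a probability.
ComponentLM = Callable[[str, History], float]


def interpolated_prob(
    token: str,
    history: History,
    components: Sequence[ComponentLM],
    context_weights: Dict[History, Sequence[float]],
    default_weights: Sequence[float],
) -> float:
    """Linear interpolation whose weights depend on the history."""
    # Use history-specific weights when available, otherwise the defaults.
    weights = context_weights.get(history, default_weights)
    return sum(w * lm(token, history) for w, lm in zip(weights, components))


if __name__ == "__main__":
    # Two toy components that return constant probabilities.
    general_lm: ComponentLM = lambda tok, hist: 0.2
    domain_lm: ComponentLM = lambda tok, hist: 0.6
    ctx = {("fly", "to"): [0.25, 0.75]}
    print(interpolated_prob("boston", ("fly", "to"),
                            [general_lm, domain_lm], ctx, [0.5, 0.5]))
```

Keying the weights on the full history is the simplest choice for a sketch; a practical system would more likely tie weights across groups of similar histories to avoid data sparsity.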
Address Redmond WA US