Title of invention: Building scalable N-gram language models using maximum likelihood maximum entropy N-gram models
Abstract: The present invention is an n-gram language modeler that significantly reduces the memory storage requirements and convergence time of language modelling systems and methods. The invention aligns each n-gram with one of n non-intersecting classes. A count is determined for each n-gram, representing the number of times that n-gram occurred in the training data. The n-grams are separated into classes, and complement counts are determined. From these counts and complement counts, factors are determined, one factor for each class, using an iterative scaling algorithm. The language model probability, i.e., the probability that a word occurs given the occurrence of the previous two words, is determined using these factors.
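The counting step and the quantity being modelled (the probability of a word given the previous two words) can be illustrated with a small sketch. This is not the patented method: it omits the classes, complement counts, per-class factors, and iterative scaling described in the abstract, and simply computes the plain maximum-likelihood trigram estimate from raw counts. The function names `count_ngrams` and `trigram_mle` and the toy sentence are assumptions made for illustration only.

```python
from collections import Counter


def count_ngrams(tokens, n):
    """Count how many times each contiguous n-gram occurs in `tokens`."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def trigram_mle(tokens):
    """Return a function prob(w3, w1, w2) giving the maximum-likelihood
    estimate of P(w3 | w1, w2) from raw trigram counts.

    Illustrative baseline only: no classes, factors, or iterative scaling.
    """
    trigram_counts = count_ngrams(tokens, 3)

    # Context count: how many trigrams start with each (w1, w2) pair.
    context_counts = Counter()
    for (w1, w2, _w3), c in trigram_counts.items():
        context_counts[(w1, w2)] += c

    def prob(w3, w1, w2):
        denom = context_counts[(w1, w2)]
        # An unseen context has no estimate; return 0.0 rather than dividing by zero.
        return trigram_counts[(w1, w2, w3)] / denom if denom else 0.0

    return prob


# Hypothetical toy training data.
tokens = "the cat sat on the mat the cat ran".split()
prob = trigram_mle(tokens)
print(prob("sat", "the", "cat"))  # 0.5
```

A raw-count estimate like this assigns zero probability to any trigram absent from the training data, which is one reason a factored model fitted by iterative scaling, as the abstract describes, may be preferred in practice.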
Publication number: US5467425(A)
Publication date: 1995.11.14
Application number: US19930023543
Filing date: 1993.02.26
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: LAU, RAYMOND; ROSENFELD, RONALD; ROUKOS, SALIM
Classification: G10L15/06; G06F17/28; G10L15/10; G10L15/18; G10L15/28; (IPC1-7): G10L9/00
Main classification: G10L15/06