Abstract |
<p>A method for compressing a language model that comprises a plurality of N-grams and associated N-gram probabilities. The method comprises forming at least one group of N-grams from the plurality of N-grams; sorting the N-gram probabilities associated with the N-grams of the at least one group; and determining a compressed representation of the sorted N-gram probabilities. The at least one group of N-grams may be formed from those N-grams of the plurality of N-grams that are conditioned on the same (N-1)-tuple of preceding words. The compressed representation of the sorted N-gram probabilities may be a sampled representation of the sorted N-gram probabilities or may comprise an index into a codebook. The invention further relates to a corresponding computer program product and device, to a storage medium for at least partially storing a language model, and to a device for processing data at least partially based on a language model.</p> |
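
The three steps named in the abstract (grouping, sorting, compressing) can be illustrated with a minimal Python sketch of the sampled variant. It is an illustration under stated assumptions, not the claimed implementation: the function names `group_by_history`, `compress_group`, and `approx_prob`, the descending sort order, the fixed sampling step, and the reconstruction by linear interpolation are all hypothetical choices made for this example.

```python
from collections import defaultdict


def group_by_history(ngrams):
    """Group N-grams that share the same (N-1)-tuple of preceding words.

    `ngrams` maps a tuple of N words to its probability.
    """
    groups = defaultdict(list)
    for words, prob in ngrams.items():
        history, last_word = words[:-1], words[-1]
        groups[history].append((last_word, prob))
    return groups


def compress_group(entries, step):
    """Sort one group by probability and keep only every `step`-th value.

    Assumption: descending order, and the final value is always kept so
    that every position lies between two stored samples.
    """
    ordered = sorted(entries, key=lambda e: e[1], reverse=True)
    words = [w for w, _ in ordered]
    probs = [p for _, p in ordered]
    samples = probs[::step]
    if (len(probs) - 1) % step != 0:
        samples.append(probs[-1])
    return words, samples


def approx_prob(words, samples, step, word):
    """Recover an approximate probability by linear interpolation.

    Assumption: interpolation is one possible way to read back a sampled
    representation; the abstract does not prescribe it.
    """
    n = len(words)
    i = words.index(word)
    lo = i // step
    lo_pos = lo * step
    if i == lo_pos:
        return samples[lo]
    hi_pos = min((lo + 1) * step, n - 1)
    frac = (i - lo_pos) / (hi_pos - lo_pos)
    return samples[lo] + frac * (samples[lo + 1] - samples[lo])


# Toy bigram model (N = 2); the probabilities are made up for illustration.
bigrams = {
    ("the", "cat"): 0.4,
    ("the", "dog"): 0.3,
    ("the", "bird"): 0.2,
    ("the", "fish"): 0.1,
    ("a", "cat"): 0.6,
    ("a", "dog"): 0.4,
}
groups = group_by_history(bigrams)
the_words, the_samples = compress_group(groups[("the",)], step=2)
```

Because the probabilities within a group are sorted, they form a monotone sequence, so a few samples can approximate the whole sequence. The codebook variant mentioned in the abstract would instead store, per group, an index into a shared table of representative sorted sequences.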