摘要 |
A system and method are proposed for measuring confusability or similarity between given entry pairs, including text string pairs and acoustic model pairs, in systems such as speech recognition and synthesis systems. A string edit distance (Levenshiten distance) can be applied to measure distance between any pair of text strings. It also can be used to calculate a confusion measurement between acoustic model pairs of different words and a model-driven method can be used to calculate a HMM model confusion matrix. This model-based approach can be efficiently calculated with low memory and low computational resources. Thus it can improve the speech recognition performance and models trained from text corpus.
|