摘要 |
For a given sentence grammar, speech recognizers are often required to decode M sets of HMMs each of which models a specific acoustic environment. In order to match input acoustic observations to each of the environments, typically recognition search methods require a network of M sub-networks. A new speech recognition search method is described here, which needs a network that is only the size of a single sub-network and yet provides the same recognition performance, thus reducing the memory requirements for network storage by (M-1)/M.
|