Abstract |
A speech recognition system uses both matrix and vector quantizers as front ends to a second-stage speech classifier, such as hidden Markov models (HMMs), and applies neural network postprocessing to, for example, improve speech recognition performance. Matrix quantization exploits the "evolution" of the short-term spectral envelopes of speech as well as frequency domain information, whereas vector quantization (VQ) operates primarily on frequency domain information. Time domain information may be substantially limited, which may introduce error into the matrix quantization; the VQ may compensate for that error. The matrix and vector quantizers may split spectral subbands to target selected frequencies for enhanced processing and may use fuzzy associations to develop fuzzy observation sequence data. A mixer provides a variety of input data to the neural network for classification determination. The neural network's ability to analyze the input data generally enhances recognition accuracy. Fuzzy operators may be utilized to reduce quantization error. Multiple codebooks may also be combined to form single respective codebooks for split matrix and split vector quantization, reducing the demand on processing resources.
|
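The abstract does not spell out how the fuzzy associations are computed. As a rough illustration only, the sketch below contrasts hard vector quantization (nearest codeword) with a fuzzy c-means-style membership rule, which is an assumed stand-in and not necessarily the rule used in the described system. The function names, the fuzziness exponent `m`, and the toy codebook are all hypothetical.

```python
def squared_distance(x, c):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, c))


def hard_quantize(x, codebook):
    """Hard VQ: index of the nearest codeword (first one wins on ties)."""
    return min(range(len(codebook)), key=lambda i: squared_distance(x, codebook[i]))


def fuzzy_memberships(x, codebook, m=2.0):
    """Fuzzy c-means-style memberships of x in each codeword (assumed rule).

    Memberships are non-negative and sum to 1; closer codewords get more weight.
    """
    if m <= 1.0:
        raise ValueError("fuzziness exponent m must be greater than 1")
    d2 = [squared_distance(x, c) for c in codebook]
    # Exact match: put all membership on the matching codeword(s).
    zeros = [i for i, d in enumerate(d2) if d == 0.0]
    if zeros:
        return [1.0 / len(zeros) if i in zeros else 0.0 for i in range(len(d2))]
    exponent = 1.0 / (m - 1.0)
    inverse = [(1.0 / d) ** exponent for d in d2]
    total = sum(inverse)
    return [v / total for v in inverse]


# Toy 2-D codebook; a real system would use spectral parameter vectors.
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
memberships = fuzzy_memberships([0.5, 0.0], codebook)
nearest = hard_quantize([0.9, 0.1], codebook)
```

A hard quantizer discards how close the input was to the runner-up codewords, whereas the membership vector keeps that information, which is one way a fuzzy observation sequence could carry less quantization error into a downstream classifier.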