Abstract |
A method is disclosed for preprocessing noisy speech so as to minimize the likelihood of estimation error, for use in a recognizer. The computationally feasible technique, herein called Minimum-Mean-Log-Spectral-Distance (MMLSD) estimation using mixture models and Markov models, comprises calculating, for each vector of noisy speech corresponding to a single time frame, an estimate of clean speech. The estimator rests on two basic assumptions: that the probability distribution of clean speech can be modeled by a mixture of components, each representing a different speech class within which different frequency channels are uncorrelated; and that noise at different frequency channels is uncorrelated. In a further embodiment of the invention, the method comprises calculating, for each sequence of vectors of noisy speech corresponding to a sequence of time frames, an estimate of clean speech. In this embodiment the basic assumptions are that the probability distribution of clean speech can be modeled by a Markov process, with different frequency channels uncorrelated within each state of the Markov process, and that noise at different frequency channels is uncorrelated. |
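
The single-frame mixture embodiment described above can be illustrated with a minimal sketch. It is not the disclosed implementation: the abstract does not specify the noise model or how the per-class conditional expectations are computed, so the sketch assumes, purely for illustration, that within each class the clean and noisy log-spectral energies of each channel are jointly Gaussian. The function name `mmlsd_mixture_estimate` and the parameters `prior`, `mu_s`, `mu_x`, `var_x` and `cov_sx` are hypothetical. What the sketch does share with the abstract is the structure: channels are treated as uncorrelated within each class, and the clean-speech estimate is a posterior-weighted sum of per-class estimates.

```python
import numpy as np


def mmlsd_mixture_estimate(x, prior, mu_s, mu_x, var_x, cov_sx):
    """Estimate clean log-spectral energies from one noisy frame.

    x      : (K,)   noisy log-spectral energies, one per frequency channel
    prior  : (N,)   mixture weights of the N speech classes
    mu_s   : (N, K) per-class mean of the clean log energies
    mu_x   : (N, K) per-class mean of the noisy log energies
    var_x  : (N, K) per-class variance of the noisy log energies
    cov_sx : (N, K) per-class covariance between clean and noisy energies

    ASSUMPTION (not from the source): clean and noisy energies are jointly
    Gaussian per class and per channel.
    """
    x = np.asarray(x, dtype=float)

    # Channels are uncorrelated within a class, so the class log-likelihood
    # of the noisy frame is a sum of per-channel Gaussian log densities.
    log_lik = -0.5 * np.sum(
        np.log(2.0 * np.pi * var_x) + (x - mu_x) ** 2 / var_x, axis=1
    )

    # Posterior probability of each class given the noisy frame,
    # normalized in the log domain for numerical stability.
    log_post = np.log(prior) + log_lik
    log_post -= np.max(log_post)
    post = np.exp(log_post)
    post /= post.sum()

    # Per-class conditional mean of the clean energies given the noisy ones
    # (the standard Gaussian regression formula, applied channel by channel).
    cond_mean = mu_s + (cov_sx / var_x) * (x - mu_x)

    # Minimum-mean-square estimate in the log-spectral domain:
    # posterior-weighted sum of the per-class conditional means.
    return post @ cond_mean, post


# Toy usage with two classes and two channels (made-up numbers).
prior = np.array([0.5, 0.5])
mu_x = np.array([[0.0, 0.0], [10.0, 10.0]])
mu_s = mu_x - 1.0
var_x = np.ones((2, 2))
cov_sx = 0.5 * np.ones((2, 2))

estimate, posterior = mmlsd_mixture_estimate(
    np.array([5.0, 5.0]), prior, mu_s, mu_x, var_x, cov_sx
)
```

In the Markov-model embodiment the class weights for a frame would depend on the whole sequence of frames rather than on that frame alone; a forward-backward pass is one standard way to obtain such weights, and it is not shown here.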