摘要 |
A method for noise-robust speech processing with cochlea filters within a computer system is disclosed. This invention provides a method for producing feature vectors from a segment of speech, that is more robust to variations in the environment due to additive noise. A first output is produced by convolving (50) a speech signal input with spatially dependent impulse responses that resemble cochlea filters. The temporal transient and the spatial transient of the first output is then enhanced by taking a time derivative (52) and a spatial derivative (54), respectively, of the first output to produce a second output. Next, all the negative values of the second output are replaced (56) with zeros. A feature vector is then obtained (58) from each frame of the second output by a multiple resolution extraction. The parameters for the cochlea filters are finally optimized by minimizing the difference between a feature vector generated from a relatively noise-free speech signal input and a feature vector generated from a noisy speech signal input. <IMAGE> |