Abstract |
A speech recognition system is realized by applying a speech signal to a feature extractor, which determines a sequence of predetermined features representing the speech signal, and by comparing the determined features, in an acceptor, to predetermined sequences of features that represent selected words. One attribute of the feature extractor is its ability to represent certain classes of sounds in terms of the position and direction of motion of the movable structures of a human vocal tract model, such as the position and direction of movement of the speaker's tongue body. The tongue body position is derived by determining the formant frequencies in the applied speech signal and by employing the Coker vocal tract model to find the tongue body position that best matches the determined formants. |
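
The formant-matching step described above can be illustrated with a minimal sketch. It assumes a search over candidate tongue body positions that keeps the candidate whose predicted formants lie closest to the measured ones. The function `toy_formants`, its linear coefficients, the normalized coordinate grid, and the relative squared-error measure are all hypothetical placeholders chosen for illustration; they are not the Coker vocal tract model, and the abstract does not specify the search procedure or the error measure.

```python
from itertools import product


def toy_formants(x, y):
    """Hypothetical stand-in for an articulatory model (NOT the Coker model).

    Maps a normalized tongue body position (x, y) to three formant
    frequencies in Hz. The linear form and coefficients are invented.
    """
    f1 = 500.0 - 250.0 * y
    f2 = 1500.0 + 700.0 * x
    f3 = 2500.0 + 200.0 * x - 100.0 * y
    return (f1, f2, f3)


def best_tongue_position(measured, model=toy_formants, steps=21):
    """Grid-search the position whose predicted formants best match `measured`.

    Returns ((x, y), error), where error is the sum of squared relative
    formant differences (an assumed distance measure).
    """
    # Candidate coordinates evenly spaced over the assumed range [-1, 1].
    grid = [-1.0 + 2.0 * i / (steps - 1) for i in range(steps)]
    best_pos, best_err = None, float("inf")
    for x, y in product(grid, grid):
        predicted = model(x, y)
        err = sum(((p - m) / m) ** 2 for p, m in zip(predicted, measured))
        if err < best_err:
            best_pos, best_err = (x, y), err
    return best_pos, best_err


def direction_of_motion(prev_pos, curr_pos):
    """Displacement of the tongue body between two successive frames."""
    return (curr_pos[0] - prev_pos[0], curr_pos[1] - prev_pos[1])


if __name__ == "__main__":
    # Formants synthesized from a known position, so the search can be checked.
    measured = toy_formants(0.5, -0.3)
    position, error = best_tongue_position(measured)
    print(position, error)
```

Applying `best_tongue_position` to successive analysis frames and passing consecutive results to `direction_of_motion` gives a position and direction pair for each frame, which is the kind of feature the abstract attributes to the feature extractor.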