摘要 |
Speaker independent recognition of small vocabularies, spoken over the long distance telephone network, is achieved using two types of models, one type for defined vocabulary words (e.g., collect, calling-card, person, third-number and operator), and one type for extraneous input which ranges from non-speech soundsto groups of non-vocabulary words (e.g. 'I want to make a collect call please'). For this type of key word spotting, modifications are made to a connected word speech recognition algorithm based on state-transitional (hidden Markov) models which allow it to recognize words from a pre-defined vocabulary list spoken in an unconstrained fashion. Statistical models of both the actual vocabulary words and the extraneous speech and background noises are created. A syntax-driven connected word recognition system is then used to find the best sequence of extraneous input and vocabulary word models for matching the actual input speech.
|