Abstract |
A speech recognition device that performs speech syllable recognition and language word identification. Syllable recognition operates on an ensemble of nearly one thousand syllables formed by the human vocal system, which allows for variations caused by language dialects and speech accents. For syllable recognition, the nearly one thousand speech syllables are parsed, using a spectrogram-feature-based approach, into a hierarchical structure based on the region of the vocal system from which the syllable emanates, the root syllable from that vocal region, the vowel-caused variation of the root syllable, and the syllable duration. Each syllable's coded representation includes a sub-code for each level of this hierarchical structure. For identification, spoken words composed of sequences of coded syllables are mapped to known language words and their grammatical attributes using a syllabic dictionary in which different pronunciations of the same word map to a single known language word.
|
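
The hierarchical syllable code and the syllabic dictionary described in the abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and is not taken from the source: the class `SyllableCode`, its integer sub-code fields, the dotted `encode` format, the function `identify_word`, and the numeric codes in the sample dictionary are all hypothetical. The sketch covers only the coding and lookup steps; the spectrogram-feature-based recognition is not modeled.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Sequence, Tuple


@dataclass(frozen=True)
class SyllableCode:
    """Hypothetical four-level syllable code (field names and ranges are assumptions)."""
    region: int    # region of the vocal system the syllable emanates from
    root: int      # root syllable within that region
    vowel: int     # vowel-caused variation of the root syllable
    duration: int  # syllable duration class

    def encode(self) -> str:
        # Join the sub-codes, one per hierarchy level, into a single string.
        return f"{self.region}.{self.root}.{self.vowel}.{self.duration}"


# Maps a sequence of coded syllables to (language word, grammatical attribute).
SyllabicDictionary = Dict[Tuple[SyllableCode, ...], Tuple[str, str]]


def identify_word(
    syllables: Sequence[SyllableCode], dictionary: SyllabicDictionary
) -> Optional[Tuple[str, str]]:
    """Return the known word and its attribute, or None if the sequence is unknown."""
    return dictionary.get(tuple(syllables))


# Made-up codes for two pronunciations of one word; the numbers carry no real meaning.
TO = SyllableCode(region=2, root=5, vowel=1, duration=0)
MAY = SyllableCode(region=1, root=3, vowel=4, duration=1)
MAH = SyllableCode(region=1, root=3, vowel=2, duration=1)
TOE = SyllableCode(region=2, root=5, vowel=3, duration=1)

DICTIONARY: SyllabicDictionary = {
    (TO, MAY, TOE): ("tomato", "noun"),
    (TO, MAH, TOE): ("tomato", "noun"),  # same word, spoken differently
}

if __name__ == "__main__":
    print(identify_word([TO, MAY, TOE], DICTIONARY))
    print(identify_word([TO, MAH, TOE], DICTIONARY))
```

Keying the dictionary on the whole syllable sequence lets several pronunciations share one entry value, which is one simple way to realize the many-to-one mapping the abstract describes. A real device would likely need approximate matching as well, which this sketch omits.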