Independent claim |
1. A computer system optimized to recognize human speech, comprising: a first module configured to receive human speech; a second module configured to morph said human speech, where said human speech is to have the same prosody and duration as the output of a text-to-speech (“TTS”) engine; a third module configured as an Automatic Speech Recognizer (“ASR”) comprising a language model and an acoustic model, where the acoustic model is created by training from said TTS; a fourth module configured as a TTS, where the TTS trains said ASR; a fifth module configured as a text source for a speech corpus, where the speech corpus is a sequence of phonetic transcriptions; and a sixth module for outputting the text. |
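
For illustration only, the data flow recited in the claim (receive, morph, recognize, output) might be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the names `resample_to_length`, `SpeechPipeline`, `recognize`, and `target_len` are hypothetical, the recognizer is a caller-supplied placeholder, and the morphing step is reduced to a crude duration match by linear interpolation. Prosody matching and the TTS-based training of the acoustic model are not modeled.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Samples = List[float]


def resample_to_length(samples: Sequence[float], target_len: int) -> Samples:
    """Stretch or compress a signal to target_len samples by linear interpolation.

    Assumption: this stands in for the second module's duration matching only.
    It also shifts pitch, so it is not a real prosody-preserving morph.
    """
    if target_len <= 0 or not samples:
        return []
    if len(samples) == 1 or target_len == 1:
        return [samples[0]] * target_len
    scale = (len(samples) - 1) / (target_len - 1)
    out: Samples = []
    for i in range(target_len):
        pos = i * scale
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out


@dataclass
class SpeechPipeline:
    # Hypothetical stand-in for the third module: an ASR whose acoustic
    # model is assumed to have been trained on TTS output elsewhere.
    recognize: Callable[[Samples], str]
    # Hypothetical target duration, in samples, taken from the TTS output.
    target_len: int

    def run(self, human_speech: Samples) -> str:
        # First module: receive speech (passed in as an argument here).
        # Second module: morph toward the TTS duration.
        morphed = resample_to_length(human_speech, self.target_len)
        # Third module: recognize; sixth module: return the text.
        return self.recognize(morphed)
```

In this sketch the recognizer is injected as a plain callable so that any ASR back end can be substituted without changing the pipeline; the fourth and fifth modules (the TTS and the text source for the speech corpus) would operate at training time and are therefore outside `run`.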