Title of invention |
System and Method for Speech Recognition Using Pitch-Synchronous Spectral Parameters |
Abstract |
The present invention defines a pitch-synchronous parametric representation of speech signals as the basis of speech recognition, and discloses methods of generating said pitch-synchronous parametric representation from speech signals. The speech signal is first passed through a pitch-mark picking program to identify the pitch periods. The speech signal is then segmented into pitch-synchronous frames. An ends-matching program equalizes the values at the two ends of the waveform in each frame. Using Fourier analysis, the speech signal in each frame is converted into a pitch-synchronous amplitude spectrum. Using Laguerre functions, said amplitude spectrum is converted into a unit vector, referred to as the timbre vector. By using a database of correlated phonemes and timbre vectors, the most likely phoneme sequence of an input speech signal can be decoded in the acoustic stage of a speech recognition system. |
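The per-frame processing described in the abstract (ends matching, Fourier analysis, Laguerre expansion, normalization to a unit vector) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent: the function names, the linear-ramp form of ends matching, the use of order-0 Laguerre functions exp(-x/2) L_m(x), and the `order` and `scale` parameters. Pitch-mark picking is assumed to have been done already, so `frame` stands for one pitch period of samples.

```python
import cmath
import math


def ends_match(frame):
    """Subtract a linear ramp so the first and last samples are equal.

    Assumed stand-in for the ends-matching program; the record does not
    specify the actual procedure.
    """
    n = len(frame)
    if n < 2:
        return list(frame)
    gap = frame[-1] - frame[0]
    return [x - gap * i / (n - 1) for i, x in enumerate(frame)]


def amplitude_spectrum(frame):
    """Magnitudes of a plain DFT for bins 0 .. n // 2."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def laguerre_functions(x, order):
    """Return exp(-x/2) * L_m(x) for m = 0 .. order - 1."""
    polys = [1.0, 1.0 - x]
    for m in range(1, order - 1):
        # Three-term recurrence: (m+1) L_{m+1} = (2m+1-x) L_m - m L_{m-1}
        polys.append(((2 * m + 1 - x) * polys[m] - m * polys[m - 1]) / (m + 1))
    weight = math.exp(-x / 2)
    return [weight * p for p in polys[:order]]


def timbre_vector(spectrum, order=8, scale=4.0):
    """Project the spectrum onto Laguerre functions and normalize to unit length."""
    coeffs = [0.0] * order
    for k, amplitude in enumerate(spectrum):
        phis = laguerre_functions(k / scale, order)  # scale is a hypothetical bin-to-x mapping
        for m in range(order):
            coeffs[m] += amplitude * phis[m]
    norm = math.sqrt(sum(c * c for c in coeffs))
    return [c / norm for c in coeffs] if norm > 0 else coeffs


# Synthetic one-period frame with a small drift, so the two ends differ.
frame = [math.sin(2 * math.pi * i / 80) + 0.3 * math.sin(4 * math.pi * i / 80) + 0.002 * i
         for i in range(80)]
matched = ends_match(frame)
spectrum = amplitude_spectrum(matched)
tv = timbre_vector(spectrum)
```

Because the result is normalized, the vector carries the spectral shape of the frame and not its loudness, which is consistent with the abstract's description of the timbre vector as a unit vector.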
Publication number |
US2014200889(A1) |
Publication date |
2014.07.17 |
Application number |
US201414216684 |
Filing date |
2014.03.17 |
Applicant |
Chen Chengjun Julian |
Inventor |
Chen Chengjun Julian |
Classification |
G10L15/18;G10L25/93;G10L25/90 |
Main classification |
G10L15/18 |
Agency |
|
Agent |
|
Principal claim |
1. A method of automatic speech recognition to convert speech signal into text using one or more processors comprising:
A) segmenting the speech signal into pitch-synchronous frames, wherein for voiced sections each said frame is a single pitch period;
B) for each frame, equalizing the two ends of the waveform using an ends-matching program;
C) generating an amplitude spectrum of each said frame using Fourier analysis;
D) transforming each said amplitude spectrum into a timbre vector using Laguerre functions;
E) performing acoustic decoding to find a list of most likely phonemes or sub-phoneme units for each said timbre vector by comparing with a timbre vector database;
F) decoding the sequence of the list of the most likely phonemes or sub-phoneme units using a language-model database to find out the most likely text. |
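Step E of the claim compares each timbre vector with a timbre vector database to obtain a list of most likely phonemes. The sketch below shows one way such a comparison could work, assuming the dot product of unit vectors as the similarity measure. The record does not state the comparison measure, so that choice is an assumption, as are the function names, the `top_k` parameter, and the three-dimensional toy database, whose labels and values are made up purely for illustration.

```python
import math


def normalize(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def most_likely_phonemes(timbre, database, top_k=3):
    """Rank database entries by dot product with the query timbre vector.

    For unit vectors the dot product equals the cosine similarity, so a
    larger value means a closer match. This measure is an assumption.
    """
    scored = [
        (sum(a * b for a, b in zip(timbre, reference)), label)
        for label, reference in database
    ]
    scored.sort(reverse=True)
    return [label for _, label in scored[:top_k]]


# Hypothetical toy database: (phoneme label, unit timbre vector).
toy_database = [
    ("AA", normalize([0.9, 0.3, 0.1])),
    ("IY", normalize([0.2, 0.9, 0.3])),
    ("UW", normalize([0.1, 0.2, 0.9])),
]

query = normalize([0.8, 0.4, 0.1])
candidates = most_likely_phonemes(query, toy_database, top_k=2)
```

The returned list keeps several candidates per frame instead of a single winner, which matches the claim's wording: step F then resolves the candidate lists into text using a language-model database.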
Address |
White Plains NY US |