Title of invention: Pitch synchronous speech coding based on timbre vectors
Abstract: A pitch-synchronous method and system for speech coding using timbre vectors is disclosed. On the encoder side, the speech signal is segmented into non-overlapping pitch-synchronous frames, and each frame is converted into a pitch-synchronous amplitude spectrum using FFT. Using Laguerre functions, the amplitude spectrum is transformed into a timbre vector. Using vector quantization, each timbre vector is converted to a timbre index based on a timbre codebook. The intensity and pitch are each converted into an index using scalar quantization. These indices are transmitted as the encoded speech. On the decoder side, the pitch, intensity and timbre vector are recovered by looking up the same codebooks. Using Laguerre functions, the amplitude spectrum is recovered from the timbre vector. Using Kramers-Kronig relations, the phase spectrum is recovered from the amplitude spectrum. Using FFT, the elementary waves are regenerated and superposed to form the speech signal.
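The quantization steps named in the abstract can be illustrated with a minimal sketch. It assumes a nearest-neighbour (Euclidean) search for the timbre codebook and a nearest-level search for the scalar codebooks; the record does not state which distance measure or codebook sizes the patent uses. All function names, codebook values and the choice of Hz for the pitch levels are hypothetical and chosen only for illustration.

```python
import math

def vq_encode(vector, codebook):
    """Return the index of the codebook entry nearest to `vector`.

    Assumption: Euclidean distance; the patent record does not specify one.
    """
    return min(range(len(codebook)), key=lambda i: math.dist(vector, codebook[i]))

def vq_decode(index, codebook):
    """Decoder side: look the index up in the same codebook."""
    return codebook[index]

def scalar_encode(value, levels):
    """Return the index of the quantization level nearest to `value`."""
    return min(range(len(levels)), key=lambda i: abs(value - levels[i]))

# Hypothetical toy codebooks (real ones would be trained on speech data).
timbre_codebook = [[0.0, 0.0, 0.0], [1.0, 0.5, 0.0], [0.2, 0.9, 0.4]]
pitch_levels = [80.0, 100.0, 125.0, 160.0, 200.0]  # illustrative values in Hz

# Encoder: one frame's timbre vector and pitch become two small integers.
timbre_index = vq_encode([0.9, 0.6, 0.1], timbre_codebook)
pitch_index = scalar_encode(118.0, pitch_levels)

# Decoder: the same codebooks turn the indices back into approximate values.
recovered_timbre = vq_decode(timbre_index, timbre_codebook)
recovered_pitch = pitch_levels[pitch_index]
```

Only the indices cross the channel, so the bit rate per frame in this sketch is set by the codebook sizes rather than by the dimension of the timbre vector.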
Publication number: US9135923(B1) Publication date: 2015.09.15
Application number: US201514605571 Filing date: 2015.01.26
Applicant: Inventor: Chen Chengjun Julian
Classification: G10L19/02;G10L19/125;G10L19/038;G10L19/035;G10L25/90;G10L19/00 Main classification: G10L19/02
Agency: Agent:
Main claim: 1. A method of speech communication from a transmitter to a receiver using a plurality of processors comprising an encoder to compress the speech signal into a digital form and a decoder to recover speech signal from the said compressed digital form comprising: (A) an encoder in the transmitter comprising the following elements: segment the voice-signal into non-overlapping frames, wherein for voiced sections the frames are pitch periods and for unvoiced sections the frame duration is a constant; identify the type of a said frame to generate a type index; identify the pitch period of a said frame from the segmentation process; generate amplitude spectra of a said frame using Fourier analysis; generate an intensity parameter of a said frame from the amplitude spectrum; transform the said amplitude spectrum into timbre vectors using Laguerre functions; apply vector quantization to the said timbre vector using a timbre-vector codebook to generate a timbre index; apply scalar quantization to said intensity parameter using an intensity codebook to generate an intensity index; apply scalar quantization to said pitch period with a pitch codebook to generate a pitch index; transmit the type index, intensity index, pitch index and timbre index to the receiver; (B) a decoder in the receiver comprising the following elements: take the transmitted intensity index, look-up into the intensity codebook to identify the intensity; take the transmitted pitch index, look-up into the pitch codebook to identify the pitch; take the transmitted timbre index, look-up into the timbre-vector codebook to identify the timbre vector; inverse transform the said timbre vector into amplitude spectra using Laguerre functions; generate phase spectrum from the amplitude spectrum using Kramers-Kronig relations; use fast Fourier transform to generate an elementary waveform from the said amplitude spectrum, phase spectrum, and intensity; superpose the said elementary waves according to the timing provided by the pitch period to generate an output speech signal.
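The decoder step that generates a phase spectrum from the amplitude spectrum can be sketched as follows. This is an assumption-laden illustration, not the patent's stated procedure: it treats the Kramers-Kronig step as minimum-phase reconstruction, where the phase is obtained from the log amplitude by folding the real cepstrum (a discrete counterpart of the Kramers-Kronig relation between the real and imaginary parts of the log spectrum). A naive DFT is used in place of an FFT to keep the example dependency-free, and the function names and the two-tap test signal are hypothetical.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[m] * cmath.exp(sign * 2j * math.pi * k * m / n) for m in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out

def minimum_phase(amplitude):
    """Phase spectrum (radians) implied by a full-length amplitude spectrum.

    Assumptions: even length, strictly positive amplitudes, and a
    minimum-phase signal model.
    """
    n = len(amplitude)
    half = n // 2
    log_amp = [math.log(a) for a in amplitude]
    # Real cepstrum: inverse transform of the log amplitude.
    cep = [v.real for v in dft(log_amp, inverse=True)]
    # Fold onto non-negative quefrencies: keep bins 0 and N/2, double the
    # bins in between, zero the rest. This yields a causal cepstrum.
    folded = [cep[0]] + [2.0 * c for c in cep[1:half]] + [cep[half]] + [0.0] * (n - half - 1)
    # The imaginary part of its transform is the minimum-phase spectrum.
    return [v.imag for v in dft(folded)]

# Check on a known minimum-phase signal h = [1, 0.5] (zero inside the unit circle).
N = 64
H = [1 + 0.5 * cmath.exp(-2j * math.pi * k / N) for k in range(N)]
phase = minimum_phase([abs(h) for h in H])
max_err = max(abs(p - cmath.phase(h)) for p, h in zip(phase, H))
```

For this test signal the reconstructed phase should agree with the true phase of `H` to within numerical precision, because the signal is minimum-phase by construction. Real speech frames need not satisfy that condition exactly, so the recovered phase is an approximation in general.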
Address: