Title: Speech decoding and encoding apparatus for lost frame concealment using predetermined number of waveform samples peripheral to the lost frame
Abstract: An audio decoding device is provided that keeps down the amount of information used for lost frame compensation while preserving encoding efficiency. A decoded sound source generator generates the CELP decoded sound source signal of a lost frame. A pitch pulse information decoder decodes pitch pulse position information and pitch pulse amplitude information. A pitch pulse waveform learner learns, in advance, a pitch pulse learning waveform from frames preceding the lost frame. A convolution adjuster considers a predetermined number of waveform samples peripheral to a peak position of the lost frame's CELP decoded excitation signal, adjusts the amplitude of the pitch pulse learning waveform according to the pitch pulse amplitude information, and places (convolves) the amplitude-adjusted pitch pulse waveform on the time axis according to the pitch pulse position information. A sound source signal corrector adds the pitch pulse waveform placed on the time axis to the decoded sound source signal of the lost frame, or replaces the corresponding part of that signal with it.
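The correction step described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the function name `place_pitch_pulse`, the centring of the pulse on the decoded position, and the peak normalisation used to apply the amplitude are all assumptions made for illustration, and the amplitude is treated as a plain magnitude.

```python
def place_pitch_pulse(excitation, learned_pulse, position, amplitude, replace=False):
    """Return a copy of `excitation` with a scaled learned pulse centred at `position`.

    Hypothetical helper: the scaling rule and the centring are assumptions,
    not details stated in the abstract.
    """
    out = list(excitation)
    half = len(learned_pulse) // 2
    # Assumed scaling: make the largest pulse sample match the decoded amplitude.
    peak = max(abs(s) for s in learned_pulse)
    gain = amplitude / peak if peak > 0 else 0.0
    for k, s in enumerate(learned_pulse):
        n = position - half + k
        if 0 <= n < len(out):  # ignore samples that fall outside the frame
            out[n] = gain * s if replace else out[n] + gain * s
    return out


if __name__ == "__main__":
    frame = [0.1] * 8
    pulse = [0.5, 1.0, -0.5]
    print(place_pitch_pulse(frame, pulse, position=3, amplitude=2.0))
    print(place_pitch_pulse(frame, pulse, position=3, amplitude=2.0, replace=True))
```

The `replace` flag switches between the two behaviours the abstract names: adding the pulse to the decoded sound source signal, or overwriting the samples it covers.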
Publication number: US8812306(B2)  Publication date: 2014.08.19
Application number: US200712307974  Filing date: 2007.07.11
Applicant: Panasonic Intellectual Property Corporation of America  Inventors: Kawashima Takuya; Ehara Hiroyuki; Yoshida Koji
Classification: G10L21/00; H04L1/00; G10L19/005; G10L25/90; H04L12/56  Main classification: G10L21/00
Agency: Greenblum & Bernstein, P.L.C.  Agent: Greenblum & Bernstein, P.L.C.
Principal claim: 1. A speech decoding apparatus, comprising: a receiver that receives frame loss information identifying a first frame that is a lost frame and speech encoded data of a second frame subsequent to the first frame, wherein the speech encoded data includes an encoded excitation signal parameter of the second frame and, if encoded pitch pulse information of the first frame exists, the encoded pitch pulse information of the first frame; a first decoder that decodes the encoded pitch pulse information, if the encoded pitch pulse information exists, to acquire a pitch pulse position and pitch pulse amplitude information of the first frame; a second decoder that decodes the encoded excitation signal parameter to acquire excitation signal parameters; a decoded excitation generation section that performs CELP decoding and concealment processing of a lost frame, and generates a decoded excitation signal using the excitation signal parameters; a pitch pulse waveform learning section that analyzes whether a steady-state is present in the decoded excitation signal and, if the steady-state is present, detects a peak position in the decoded excitation signal, extracts a predetermined number of samples peripheral to the peak position, performs learning of a pitch pulse waveform by smoothing the extracted samples with a past pitch pulse learning waveform, and generates a learned pitch pulse waveform; an excitation signal compensation section that adjusts the amplitude of the learned pitch pulse waveform according to the pitch pulse amplitude information of the first frame and adds the learned pitch pulse waveform to the decoded excitation signal according to the pitch pulse position of the first frame to generate a compensated excitation signal; and an excitation signal selection section that selects the decoded excitation signal if the frame loss information does not indicate a frame loss and selects the compensated excitation signal if the frame loss information does indicate a frame loss.
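The learning and selection elements of the claim can be sketched in Python as follows. This is only an illustration under stated assumptions: the claim does not say how the steady-state is detected or how the smoothing is performed, so the energy-ratio test in `is_steady`, the exponential averaging in `update_learned_pulse`, and every function and parameter name here are hypothetical.

```python
def is_steady(excitation, prev_energy, tolerance=0.5):
    """Assumed steady-state test: frame energy stays close to the previous frame's."""
    energy = sum(s * s for s in excitation)
    if prev_energy <= 0:
        return False
    return abs(energy - prev_energy) <= tolerance * prev_energy


def extract_around_peak(excitation, half_width):
    """Find the largest-magnitude sample and return (peak index, surrounding window)."""
    peak = max(range(len(excitation)), key=lambda n: abs(excitation[n]))
    window = []
    for n in range(peak - half_width, peak + half_width + 1):
        # Zero-pad when the window runs past either end of the frame.
        window.append(excitation[n] if 0 <= n < len(excitation) else 0.0)
    return peak, window


def update_learned_pulse(past, window, alpha=0.9):
    """Assumed smoothing: exponential average of the past waveform and the new window."""
    if past is None:
        return list(window)
    return [alpha * p + (1.0 - alpha) * w for p, w in zip(past, window)]


def select_excitation(decoded, compensated, frame_lost):
    """Pick the compensated excitation only when the frame loss flag is set."""
    return compensated if frame_lost else decoded


if __name__ == "__main__":
    frame = [0.0, 1.0, -4.0, 2.0, 0.0]
    learned = None
    if is_steady(frame, prev_energy=20.0):
        _, window = extract_around_peak(frame, half_width=1)
        learned = update_learned_pulse(learned, window)
    print(learned)
```

A smoothing factor `alpha` close to 1 makes the learned waveform change slowly, which is one plausible reading of learning only while the signal is steady; the claim itself does not fix a value.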
Address: Torrance CA US