Title of invention: Estimation system of spectral envelopes and group delays for sound analysis and synthesis, and audio signal synthesis system
Abstract: For high-accuracy analysis and high-quality synthesis of voice sound (singing and speech), provided herein are a system and a method for estimating, from an audio signal, spectral envelopes and group delays for sound analysis and synthesis with high accuracy and high temporal resolution. An estimation system of spectral envelopes and group delays includes a fundamental frequency estimation section, an amplitude spectrum acquisition section, a group delay extraction section, a spectral envelope integration section, and a group delay integration section. The spectral envelope integration section sequentially obtains a spectral envelope for sound synthesis by averaging overlapped spectra. The group delay integration section selects, from a plurality of group delays, a group delay corresponding to the maximum envelope of each frequency component of the spectral envelope, and integrates the group delays thus selected to sequentially obtain a group delay for sound synthesis.
Publication No.: US9368103(B2) Publication date: 2016.06.14
Application No.: US201314418680 Filing date: 2013.07.30
Applicant: NATIONAL INSTITUTE OF ADVANCED INDUSTRIAL SCIENCE AND TECHNOLOGY Inventors: Nakano Tomoyasu;Goto Masataka
Classification: G10L25/90;G10L13/02;G10L25/18;G10L25/45;G10L25/15;G10L25/78;G10L21/013;G10L19/022 Main classification: G10L25/90
Agency: Rankin, Hill & Clark LLP Agent: Rankin, Hill & Clark LLP
Main claim: 1. An estimation system of spectral envelopes and group delays for sound analysis and synthesis comprising at least one processor operable to function as: a fundamental frequency estimation section configured to estimate F0s from an audio signal at all points of time or at all points of sampling; an amplitude spectrum acquisition section configured to divide the audio signal into a plurality of frames, centering on each point of time or each point of sampling, by using a window having a window length changing with F0 at each point of time or each point of sampling, to perform Discrete Fourier Transform (DFT) analysis on the plurality of frames of the audio signal, and thus to acquire amplitude spectra at the respective frames; a group delay extraction section configured to extract group delays as phase frequency differentials at the respective frames by performing a group delay extraction algorithm accompanied by DFT analysis on the plurality of frames of the audio signal; a spectral envelope integration section configured to obtain overlapped spectra at a predetermined time interval by overlapping the amplitude spectra corresponding to the frames included in a certain period determined based on a fundamental period of F0, and to average the overlapped spectra to sequentially obtain a spectral envelope for sound synthesis; and a group delay integration section configured to select a group delay corresponding to a maximum envelope for each frequency component of the spectral envelope from the group delays at a predetermined time interval, and to integrate the thus selected group delays to sequentially obtain a group delay for sound synthesis.
Address: Tokyo JP
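Illustrative sketch (not part of the patent record): the processing chain recited in claim 1 can be approximated in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the patented implementation. The function names (`f0_adaptive_frame`, `amplitude_and_group_delay`, `integrate_frames`), the Hann window, the default window length of three fundamental periods, the plain arithmetic mean, and the per-bin "largest amplitude wins" selection rule are all hypothetical choices made for illustration. The group delay is computed with the standard DFT identity tau = Re(Y * conj(X)) / |X|^2, where Y is the DFT of n * x[n]; the record itself does not name a specific extraction algorithm.

```python
import numpy as np


def f0_adaptive_frame(signal, center, f0, sample_rate, periods=3.0):
    """Cut a Hann-windowed frame centered on `center`.

    The frame length scales with the fundamental period 1/f0
    (assumption: `periods` fundamental periods per window).
    Samples outside the signal are treated as zeros.
    """
    half = int(round(periods * sample_rate / f0 / 2))
    idx = np.arange(center - half, center + half + 1)
    valid = (idx >= 0) & (idx < len(signal))
    frame = np.zeros(len(idx))
    frame[valid] = signal[idx[valid]]
    return frame * np.hanning(len(idx))


def amplitude_and_group_delay(frame, n_fft, eps=1e-12):
    """Return (amplitude spectrum, group delay in samples) for one frame.

    Group delay is the negative derivative of phase with respect to
    angular frequency, measured here relative to the frame center.
    `n_fft` should be at least len(frame), otherwise rfft truncates.
    """
    n = np.arange(len(frame)) - len(frame) // 2  # time index from center
    X = np.fft.rfft(frame, n_fft)
    Y = np.fft.rfft(n * frame, n_fft)
    power = X.real ** 2 + X.imag ** 2
    group_delay = (X.real * Y.real + X.imag * Y.imag) / (power + eps)
    return np.abs(X), group_delay


def integrate_frames(amplitudes, group_delays):
    """Integrate overlapping frames, given as arrays of shape (frames, bins).

    Envelope: mean of the overlapped amplitude spectra (assumed plain mean).
    Group delay: for each bin, taken from the frame whose amplitude is
    largest at that bin (assumed reading of "maximum envelope").
    """
    amplitudes = np.asarray(amplitudes, dtype=float)
    group_delays = np.asarray(group_delays, dtype=float)
    envelope = amplitudes.mean(axis=0)
    winner = amplitudes.argmax(axis=0)
    bins = np.arange(amplitudes.shape[1])
    return envelope, group_delays[winner, bins]
```

In use, one would call `f0_adaptive_frame` and `amplitude_and_group_delay` at each analysis instant with the locally estimated F0, stack the results for the frames that fall within the period-dependent range around a given time, and pass the stacks to `integrate_frames`. F0 estimation itself is outside this sketch.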