Title of invention: Pitch period segmentation of speech signals
Abstract: A method for automatic segmentation of pitch periods of speech waveforms takes as inputs a speech waveform, the corresponding fundamental frequency contour, which can be computed by a standard fundamental frequency detection algorithm, and optionally the voicing information of the speech waveform, which can be computed by a standard voicing detection algorithm. It calculates the corresponding pitch period boundaries of the speech waveform as outputs by iteratively (1) calculating the Fast Fourier Transform (FFT) of a speech segment approximately two periods long, the period being calculated as the inverse of the mean fundamental frequency associated with that segment; (2) placing the pitch period boundary either at the position where the phase of the third FFT coefficient is −180 degrees, or at the position where the correlation coefficient of two speech segments shifted within the two-period-long analysis frame is maximized, or at a position calculated as a combination of both measures; and (3) shifting the analysis frame one period length further, until the end of the speech waveform is reached.
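The choice of the third FFT coefficient can be motivated with a short derivation. This is a reading of the abstract, not text from the patent; it assumes that coefficients are counted from the DC term (so the third coefficient is bin k = 2), and the symbols f_s (sampling rate), N (frame length in samples), P (period in samples) and s, d (frame start and shift in samples) are introduced here for illustration only.

```latex
% Assumptions: sampling rate f_s, mean fundamental frequency \bar{f}_0,
% frame length N = 2 f_s / \bar{f}_0 samples, DFT bins indexed from k = 0.
f_k = \frac{k\, f_s}{N} = \frac{k\, \bar{f}_0}{2}
\quad\Longrightarrow\quad
f_2 = \bar{f}_0
% Shifting the frame start by d samples rotates the phase of bin k = 2:
\varphi_2(s + d) \approx \varphi_2(s) + \frac{2\pi d}{P},
\qquad P = \frac{f_s}{\bar{f}_0}
```

Under these assumptions the third coefficient of a two-period frame sits at the fundamental frequency, and its phase sweeps through a full cycle once per pitch period, so a fixed target phase such as −180 degrees marks one position per period.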
Publication number: US9196263(B2) | Publication date: 2015.11.24
Application number: US201013520034 | Filing date: 2010.12.29
Applicant: Synvo Gmbh | Inventor: Romsdorfer Harald
Classification: G10L21/00; G10L25/90 | Main classification: G10L21/00
Agency: Smith Risley Tempel Santos LLC | Agent: Blaha Robert A.; Smith Risley Tempel Santos LLC
Main claim: 1. A method for automatic segmentation of pitch periods of speech waveforms, the method comprising: taking the speech waveform and the corresponding fundamental frequency contour of the speech waveform as inputs; and calculating the corresponding pitch period boundaries of the speech waveform as outputs by iteratively calculating the Fast Fourier Transform (FFT) of a speech segment of approximately two period length, calculated as the inverse of the mean fundamental frequency associated with these speech segments, placing the pitch period boundary at the position where the phase of the third FFT coefficient is −180 degrees, and shifting the analysis frame one period length further until the end of the speech waveform is reached.
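The iteration in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation, and several details are assumptions made here: the function names `third_dft_coefficient` and `segment_pitch_periods` are hypothetical, the fundamental frequency contour is assumed to hold one strictly positive value in Hz per sample (fully voiced speech), the third coefficient is taken to be DFT bin k = 2 and is computed directly instead of through a full FFT, and the position of −180 degrees phase is found analytically from the phase of the current frame rather than by search.

```python
import cmath
import math


def third_dft_coefficient(frame):
    """Return DFT bin k = 2 (third coefficient, counting from DC) of a frame."""
    n_samples = len(frame)
    return sum(
        x * cmath.exp(-2j * math.pi * 2 * n / n_samples)
        for n, x in enumerate(frame)
    )


def segment_pitch_periods(signal, f0_contour, sample_rate):
    """Sketch of phase-based pitch period segmentation.

    signal:      list of samples
    f0_contour:  fundamental frequency in Hz, one value per sample (assumed > 0)
    sample_rate: sampling rate in Hz
    Returns a list of boundary positions in samples.
    """
    boundaries = []
    position = 0.0  # start of the analysis frame, kept as float to avoid drift
    while True:
        begin = int(round(position))
        if begin >= len(signal):
            break
        # First guess of the frame length from the local f0, then refine the
        # period as the inverse of the mean f0 over that two-period span.
        guess_len = int(round(2 * sample_rate / f0_contour[begin]))
        local_f0 = f0_contour[begin:begin + guess_len]
        period = sample_rate / (sum(local_f0) / len(local_f0))
        frame_len = int(round(2 * period))
        if begin + frame_len > len(signal):
            break  # not enough samples left for a full two-period frame
        theta = cmath.phase(
            third_dft_coefficient(signal[begin:begin + frame_len])
        )
        # Shifting the frame by d samples adds 2*pi*d/period to the phase,
        # so solve for the shift in [0, period) that yields -180 degrees.
        offset = period * ((-math.pi - theta) % (2 * math.pi)) / (2 * math.pi)
        boundaries.append(int(round(begin + offset)))
        position += period  # shift the analysis frame one period further
    return boundaries


if __name__ == "__main__":
    fs = 8000
    tone = [math.cos(2 * math.pi * 100 * n / fs) for n in range(800)]
    print(segment_pitch_periods(tone, [100.0] * len(tone), fs))
```

For a pure 100 Hz cosine sampled at 8 kHz, the sketch places one boundary per 80-sample period, at the negative peaks of the waveform. With real speech, a computed offset close to 0 or to one full period can wrap between consecutive frames, so a practical version would need additional continuity handling that is not shown here.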
Address: Leoben AT