Title of Invention |
Method and Apparatus for Processing Speech Signal According to Frequency-Domain Energy |
Abstract |
A method and an apparatus for processing a speech signal according to frequency-domain energy. The method includes receiving an original speech signal that includes a first speech frame and a second speech frame adjacent to each other, performing a Fourier transform on the first speech frame and the second speech frame, obtaining a frequency-domain energy distribution of each of the two speech frames, obtaining a frequency-domain energy correlation coefficient between them, and segmenting the original speech signal according to the frequency-domain energy correlation coefficient. This may resolve the problem that, when refined speech signal segmentation is performed, the segmentation result has low accuracy because of a characteristic of a phoneme of the speech signal or the severe impact of noise. |
Application Publication Number |
US2016351204(A1) |
Application Publication Date |
2016.12.01 |
Application Number |
US201615237095 |
Filing Date |
2016.08.15 |
Applicant |
Huawei Technologies Co., Ltd. |
Inventor |
Xu Lijing |
Classification |
G10L21/0308;G10L25/18;G10L25/06 |
Main Classification |
G10L21/0308 |
Agency |
|
Agent |
|
Independent Claim |
1. A method for processing a speech signal according to frequency-domain energy, comprising:
receiving an original speech signal, wherein the original speech signal comprises a first speech frame and a second speech frame that are adjacent to each other;
performing a Fourier transform on the first speech frame to obtain a first frequency-domain signal;
performing the Fourier transform on the second speech frame to obtain a second frequency-domain signal;
obtaining a frequency-domain energy distribution of the first speech frame according to the first frequency-domain signal;
obtaining a frequency-domain energy distribution of the second speech frame according to the second frequency-domain signal, wherein the frequency-domain energy distribution represents an energy distribution characteristic of the speech frame in a frequency domain;
obtaining a frequency-domain energy correlation coefficient between the first speech frame and the second speech frame according to the frequency-domain energy distribution of the first speech frame and the frequency-domain energy distribution of the second speech frame, wherein the frequency-domain energy correlation coefficient is used to represent a spectral change from the first speech frame to the second speech frame; and
segmenting the original speech signal according to the frequency-domain energy correlation coefficient. |
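The claimed steps can be illustrated with a minimal Python sketch. It is not the patented implementation: this record does not define the energy distribution, the correlation coefficient, or the segmentation rule, so the sketch assumes a per-bin share of total spectral energy, a Pearson correlation between adjacent frames, and a boundary wherever that correlation falls below a fixed threshold. The function names, `frame_len`, and `threshold` are all hypothetical.

```python
import cmath
import math


def dft(frame):
    """Naive discrete Fourier transform (O(n^2)); adequate for short illustrative frames."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n)
    ]


def energy_distribution(frame):
    """Assumed definition: each non-negative-frequency bin's share of total spectral energy."""
    spectrum = dft(frame)[: len(frame) // 2 + 1]
    energy = [abs(c) ** 2 for c in spectrum]
    total = sum(energy)
    if total == 0.0:
        return [0.0] * len(energy)
    return [e / total for e in energy]


def correlation(a, b):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def segment(signal, frame_len=64, threshold=0.5):
    """Return sample indices where the correlation between adjacent frames drops below threshold.

    The fixed threshold is an assumption made for illustration only.
    """
    frames = [
        signal[i:i + frame_len]
        for i in range(0, len(signal) - frame_len + 1, frame_len)
    ]
    dists = [energy_distribution(f) for f in frames]
    boundaries = []
    for k in range(1, len(dists)):
        if correlation(dists[k - 1], dists[k]) < threshold:
            boundaries.append(k * frame_len)
    return boundaries


def tone(freq_bin, n, frame_len=64):
    """Synthetic sinusoid with freq_bin cycles per frame_len samples (test signal only)."""
    return [math.sin(2 * math.pi * freq_bin * t / frame_len) for t in range(n)]


# Two synthetic "phonemes": the spectrum changes at sample 128.
signal = tone(4, 128) + tone(12, 128)
print(segment(signal))  # prints [128]
```

In this toy signal, adjacent frames within the same tone have nearly identical energy distributions (correlation close to 1), while the frames on either side of sample 128 concentrate their energy in different bins, so the correlation drops sharply and a boundary is reported there.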
Address |
Shenzhen CN |