Abstract |
The invention provides a method for detecting voice activity (a VAD method) in a voice signal, particularly for telephony applications. The method comprises: a first step of acquiring the voice signal (1), divided into segments or frames of duration d; a second step of computing, for each frame, at least three of the following five parameters: the full-band energy differential ΔEf, the energy differential over the 0-1 kHz band ΔEl, the zero-crossing rate differential ΔZCR, the second cepstral coefficient c2, and the fifth cepstral coefficient c5; and a third step in which a neural network process provides, for each frame and based on at least three of said five parameters, an output value Y in the range bounded by a minimum value Ymin and a maximum value Ymax, where Ymin < Ymax. The invention also provides a VAD apparatus for performing said VAD method, a method for segmenting isolated words (an EPD method) that includes the steps of said VAD method, and a related EPD apparatus. |
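
The three steps can be illustrated with a minimal sketch, which is not the claimed implementation. It makes several assumptions that the abstract does not state: each differential is taken as the difference between the current and the previous frame, energies are expressed in dB, the two cepstral coefficients are omitted (the abstract requires only three of the five parameters), and the neural network is replaced by a single sigmoid neuron with hypothetical weights. All function names are illustrative.

```python
import math

def frame_signal(samples, frame_len):
    """Step 1: split the signal into consecutive frames of frame_len samples."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]

def full_band_energy(frame):
    """Mean energy over the whole band, in dB (a small floor avoids log(0))."""
    e = sum(x * x for x in frame) / len(frame)
    return 10.0 * math.log10(e + 1e-12)

def low_band_energy(frame, sample_rate, cutoff_hz=1000.0):
    """Energy of the DFT bins from 0 Hz up to cutoff_hz, in dB (naive DFT)."""
    n = len(frame)
    max_bin = int(cutoff_hz * n / sample_rate)
    e = 0.0
    for k in range(max_bin + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(frame))
        im = sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(frame))
        e += re * re + im * im
    return 10.0 * math.log10(e / (n * n) + 1e-12)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def differentials(frames, sample_rate):
    """Step 2: per-frame (d_ef, d_el, d_zcr).

    Assumption: each differential is current frame minus previous frame;
    the first frame has no predecessor and gets zeros.
    """
    feats = [(full_band_energy(f), low_band_energy(f, sample_rate),
              zero_crossing_rate(f)) for f in frames]
    out = [(0.0, 0.0, 0.0)]
    for prev, cur in zip(feats, feats[1:]):
        out.append(tuple(c - p for c, p in zip(cur, prev)))
    return out

def network_output(features, weights, bias, y_min=0.0, y_max=1.0):
    """Step 3 stand-in: one sigmoid neuron scaled to [y_min, y_max].

    The weights and bias are hypothetical; a trained network would replace this.
    """
    z = bias + sum(w * x for w, x in zip(weights, features))
    z = max(-60.0, min(60.0, z))  # clamp to keep math.exp well-behaved
    s = 1.0 / (1.0 + math.exp(-z))
    return y_min + (y_max - y_min) * s

# Usage: one silent frame followed by one frame of a 500 Hz tone, at 8 kHz.
sample_rate = 8000
frame_len = 80  # 10 ms frames (an assumed value for the duration d)
silence = [0.0] * frame_len
tone = [math.sin(2 * math.pi * 500 * i / sample_rate) for i in range(frame_len)]
frames = frame_signal(silence + tone, frame_len)
params = differentials(frames, sample_rate)
outputs = [network_output(p, weights=(0.1, 0.1, 1.0), bias=0.0) for p in params]
```

With these hypothetical weights, the output Y rises on the frame where the tone begins, because both energy differentials are strongly positive there. Scaling the sigmoid keeps Y between the chosen Ymin and Ymax for every frame.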