Abstract |
A trained vector generation section 16 generates in advance a trained vector V for unvoiced sounds. During a non-voice period, an LPC cepstrum analysis section 18 generates a feature vector A of the sound within that period, an inner product operation section 19 calculates the inner product V^T A between the feature vector A and the trained vector V, and a threshold generation section 20 generates a threshold θ_V on the basis of that inner product. The LPC cepstrum analysis section 18 also generates a prediction residual power ε of the signal within the non-voice period, and a threshold generation section 22 generates a threshold THD on the basis of ε. When a voice is actually uttered, the LPC cepstrum analysis section 18 generates the feature vector A and the prediction residual power ε of the input signal Saf, and the inner product operation section 19 calculates the inner product V^T A between that feature vector A and the trained vector V. A threshold determination section 21 compares V^T A with the threshold θ_V and judges the signal to belong to a voice section if θ_V ≤ V^T A. Likewise, a threshold determination section 23 compares the prediction residual power ε of the input signal Saf with the threshold THD and judges a voice section if THD ≤ ε. The voice section is finally determined where θ_V ≤ V^T A or THD ≤ ε holds, and the input signal Svc for voice recognition is extracted.
|
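
The two-threshold decision described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the analysis order of 10, and the rule for deriving each threshold (mean of the non-voice values plus a margin) are assumptions made for illustration, since the abstract only says the thresholds are generated "on the basis of" the non-voice measurements. The trained vector V is supplied by the caller, because the abstract does not describe how it is trained.

```python
def autocorr(frame, order):
    """Biased autocorrelation r[0..order] of one frame (list of floats)."""
    n = len(frame)
    return [sum(frame[i] * frame[i + k] for i in range(n - k))
            for k in range(order + 1)]

def levinson(r, order):
    """Levinson-Durbin recursion.
    Returns (a, err): a[0] = 1 with A(z) = 1 + a1 z^-1 + ...,
    and err = prediction residual power (epsilon in the abstract)."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:          # silent or perfectly predictable frame
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)
    return a, max(err, 0.0)

def lpc_cepstrum(a, order):
    """LPC cepstrum c[1..order] of 1/A(z); plays the role of feature vector A."""
    c = [0.0] * (order + 1)
    for n in range(1, order + 1):
        s = sum(m * c[m] * a[n - m] for m in range(1, n))
        c[n] = -a[n] - s / n
    return c[1:]

def analyze(frame, order=10):
    """Return (feature vector A, residual power epsilon) for one frame."""
    a, eps = levinson(autocorr(frame, order), order)
    return lpc_cepstrum(a, order), eps

def inner_product(v, feat):
    """V^T A."""
    return sum(x * y for x, y in zip(v, feat))

def make_threshold(noise_values, margin):
    """ASSUMED rule: mean of the non-voice measurements plus a margin."""
    return sum(noise_values) / len(noise_values) + margin

def is_voice_frame(vta, eps, theta_v, thd):
    """Voice section if theta_V <= V^T A or THD <= epsilon."""
    return theta_v <= vta or thd <= eps
```

In use, `analyze` would be run over frames from the non-voice period to collect V^T A and ε values, `make_threshold` would turn each collection into θ_V and THD, and `is_voice_frame` would then be applied to every frame of the input signal. The OR combination means a frame is kept if either test passes.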