Abstract |
<p>PURPOSE: To obtain a high recognition rate from a comparatively short utterance by feeding a neural network (NN) a sequence of vectors that express the outline of the short-time spectrum extracted from input speech, and judging the resulting sequence of output vectors as a whole. CONSTITUTION: A preprocessing part 12 calculates vectors expressing the outline of the short-time spectrum from training speech, and the NN is constructed by inputting this vector sequence to an NN part 13. At recognition time, the preprocessing part 12 calculates the same kind of vectors from the speech of an unspecified speaker, as in training, and inputting this vector sequence to the NN yields a sequence of output vectors. Each output vector indicates the speaker for one short-time input. A decision part 15 judges the entire sequence from an output vector calculation part 14 together, based on the majority, sum, or product over all output vectors, and produces a single speaker recognition result. The speaker can thus be recognized from a short utterance, without restricting the uttered content.</p> |
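
The decision step described above (majority, sum, or product over the sequence of output vectors) can be illustrated with a minimal sketch. It is not the patented implementation: the function name `decide_speaker`, the `eps` floor, and the lowest-index tie-breaking are assumptions made for illustration. The sketch also assumes each output vector holds one non-negative score per enrolled speaker, and it evaluates the product rule as a sum of logarithms to avoid numerical underflow, which yields the same ranking as a direct product.

```python
import math


def decide_speaker(outputs, rule="sum", eps=1e-12):
    """Combine per-frame NN output vectors into one speaker index.

    outputs: list of equal-length lists, one score per speaker per frame
             (assumed non-negative; hypothetical input format).
    rule:    "majority", "sum", or "product".
    eps:     assumed floor so log(0) never occurs in the product rule.
    """
    if not outputs:
        raise ValueError("at least one output vector is required")
    n_speakers = len(outputs[0])
    speakers = range(n_speakers)

    if rule == "majority":
        # Each frame votes for its highest-scoring speaker.
        scores = [0] * n_speakers
        for vec in outputs:
            scores[max(speakers, key=vec.__getitem__)] += 1
    elif rule == "sum":
        # Add up each speaker's score over all frames.
        scores = [sum(vec[k] for vec in outputs) for k in speakers]
    elif rule == "product":
        # Product over frames, computed as a sum of logs for stability.
        scores = [
            sum(math.log(max(vec[k], eps)) for vec in outputs)
            for k in speakers
        ]
    else:
        raise ValueError(f"unknown rule: {rule!r}")

    # Ties resolve to the lowest speaker index (an assumption).
    return max(speakers, key=scores.__getitem__)
```

For example, with the three frames `[[0.51, 0.49], [0.51, 0.49], [0.1, 0.9]]`, the majority rule selects speaker 0 (two of three frames favor it), while the sum and product rules select speaker 1, because the third frame favors speaker 1 strongly. This shows that the three rules weigh per-frame confidence differently.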