Abstract |
In the recognition phase, the signal from a sensor (10) is processed to obtain parameters, which are compared with those stored in a dictionary (16) during the training phase in order to recognise the vocal structures uttered by the user in a noisy environment. In both the training and recognition phases, obtaining these parameters includes forming digital frames S(n) of predetermined length from the sensor signal, transforming each frame from the time domain to the frequency domain to obtain a spectrum X(i), and applying an inverse transformation, from the frequency domain to the time domain, to the quantity $\lvert X(i)\rvert^{\gamma}$, where $\lvert X(i)\rvert$ represents the modulus of the spectrum and $\gamma$ represents an appropriate exponent. |
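
The parameter extraction described above (framing, forward transform, raising the modulus to the exponent gamma, inverse transform) can be sketched as follows. This is only an illustrative sketch, not the method as claimed: the function names `dft`, `make_frames` and `root_parameters`, the non-overlapping framing without windowing, the naive DFT, the value gamma = 0.5 and the test tone are all assumptions introduced here. The comparison with the dictionary (16) is not shown.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform; the inverse applies the 1/N factor."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def make_frames(signal, frame_len):
    """Split the sampled signal into non-overlapping frames S(n) of fixed length
    (assumption: no overlap, no window; trailing samples are dropped)."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, frame_len)]


def root_parameters(frame, gamma):
    """Time -> frequency, raise the modulus to gamma, then frequency -> time."""
    spectrum = dft(frame)                              # X(i)
    compressed = [abs(x) ** gamma for x in spectrum]   # |X(i)|^gamma
    # For a real frame |X(i)| is symmetric, so the inverse transform is
    # real up to rounding error; keep only the real part.
    return [c.real for c in dft(compressed, inverse=True)]


# Hypothetical test signal: a 1 kHz tone sampled at 8 kHz (assumed values)
signal = [math.sin(2 * math.pi * 1000 * t / 8000) for t in range(64)]
frames = make_frames(signal, frame_len=32)
params = [root_parameters(f, gamma=0.5) for f in frames]
```

With gamma = 2 the same routine returns the circular autocorrelation of the frame, which gives a convenient sanity check: its first coefficient equals the frame energy.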