Abstract |
PROBLEM TO BE SOLVED: To provide a method for estimating the pronunciation and content of a speaker's utterance from images of the speaker, without being affected by ambient noise. SOLUTION: The contour shape of the lips is measured at predetermined time intervals from an image of the speaker's face, and its characteristics are quantified as a finite set of Fourier descriptors by the Fourier descriptor method. After these values are recorded, the starting point and ending point of the utterance are determined from changes in the values of predetermined Fourier descriptors. For a plurality of Fourier descriptors, including the predetermined ones, an approximating polynomial of degree n is fitted by regression to the data between the starting point and the ending point, which yields n+1 coefficients for each Fourier descriptor. These coefficients are compared, using a linear discriminant function, with standard coefficient values determined beforehand for each vowel and consonant, in order to estimate the character uttered by the speaker. COPYRIGHT: (C)2008,JPO&INPIT
|
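The processing chain described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names, the use of descriptor magnitudes, the fixed change threshold for locating the utterance, the time axis normalised to [0, 1], and the identity-covariance form of the linear discriminant are all assumptions made for illustration; the abstract does not specify any of them.

```python
import numpy as np


def fourier_descriptors(contour_xy, num_descriptors):
    """Magnitudes of the first low-order Fourier coefficients of a closed contour.

    contour_xy: array of shape (N, 2) holding sampled lip-contour points.
    The zero-order term (contour position) is skipped. Using magnitudes only
    is an assumption of this sketch.
    """
    z = contour_xy[:, 0] + 1j * contour_xy[:, 1]
    coeffs = np.fft.fft(z) / len(z)
    return np.abs(coeffs[1:num_descriptors + 1])


def utterance_bounds(series, threshold):
    """Estimate start and end frames from the frame-to-frame change of one descriptor.

    The fixed threshold is a hypothetical rule; the abstract only says the
    bounds are found from changes in predetermined descriptors.
    """
    change = np.abs(np.diff(series))
    active = np.flatnonzero(change > threshold)
    if active.size == 0:
        return None
    return int(active[0]), int(active[-1] + 1)


def trajectory_coefficients(descriptor_series, degree):
    """Fit a degree-n polynomial to each descriptor's time series.

    descriptor_series: array of shape (frames, K).
    Returns an array of shape (K, degree + 1), highest power first.
    Time is normalised to [0, 1] (an assumption of this sketch).
    """
    t = np.linspace(0.0, 1.0, descriptor_series.shape[0])
    # With a 2-D y, np.polyfit fits every column independently.
    return np.polyfit(t, descriptor_series, degree).T


def classify(features, class_means):
    """Pick the class with the largest linear discriminant score.

    Assumes identity covariance and equal priors, so that
    g_c(x) = mu_c . x - 0.5 * |mu_c|^2 (a simplification, not from the source).
    class_means maps a label to a reference coefficient array.
    """
    x = np.ravel(features)
    scores = {}
    for label, mean in class_means.items():
        mu = np.ravel(mean)
        scores[label] = float(mu @ x - 0.5 * (mu @ mu))
    return max(scores, key=scores.get)
```

In use, `fourier_descriptors` would be applied to the lip contour of every frame, `utterance_bounds` would select the frames belonging to one utterance, `trajectory_coefficients` would reduce those frames to n+1 coefficients per descriptor, and `classify` would compare the result with reference coefficients prepared in advance for each vowel and consonant.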