摘要 |
An information processing apparatus for speech recognition includes a first speech dataset storing speech data uttered by low recognition rate speakers; a second speech dataset storing speech data uttered by a plurality of speakers; a third speech dataset storing speech data to be mixed with the speech data of the second speech dataset; a similarity calculating part obtaining, for each piece of the speech data in the second speech dataset, a degree of similarity to a given average voice in the first speech dataset; a speech data selecting part recording the speech data, the degree of similarity of which is within a given selection range, as selected speech data in the third speech dataset; and an acoustic model generating part generating a first acoustic model using the speech data recorded in the second speech dataset and the third speech dataset. |