Abstract |
PROBLEM TO BE SOLVED: To improve recognition performance. SOLUTION: A synchronous processing section 2 synchronizes an input image and speech, and a feature extraction section 3 extracts feature quantities from the synchronized image and speech respectively and obtains an integrated feature quantity by combining the image features and the speech features. A learning section 7 performs learning on the integrated feature quantity, forms a model corresponding to an image and speech that indicate the same concept, and builds a dictionary that associates the model with concept information indicating the concept of that image and speech. A recognition processing section 5, in turn, performs matching between the integrated feature quantity and the models in the dictionary, thereby recognizing the concept indicated by the input image and speech.
|
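The flow described in the abstract (synchronize, extract and combine features, learn a dictionary, then match) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the abstract: the function names are hypothetical, synchronization is approximated by pairing each image frame with the speech frame at the same relative position, the integrated feature quantity is a time-averaged concatenation, each "model" is a simple per-concept centroid, and matching is nearest-centroid by Euclidean distance. The abstract does not specify any of these choices.

```python
from math import dist
from statistics import fmean


def synchronize(image_frames, speech_frames):
    """Pair each image frame with the speech frame at the same relative time.

    Assumption: both streams cover the same time span at different frame rates.
    """
    n_img, n_sp = len(image_frames), len(speech_frames)
    pairs = []
    for i, img in enumerate(image_frames):
        j = min(n_sp - 1, i * n_sp // n_img)
        pairs.append((img, speech_frames[j]))
    return pairs


def extract_integrated_features(pairs):
    """Concatenate image and speech features per frame, then average over time."""
    joined = [list(img) + list(sp) for img, sp in pairs]
    return [fmean(column) for column in zip(*joined)]


def learn(training_samples):
    """Build a dictionary: concept information -> model (a centroid, by assumption)."""
    grouped = {}
    for concept, image_frames, speech_frames in training_samples:
        feature = extract_integrated_features(synchronize(image_frames, speech_frames))
        grouped.setdefault(concept, []).append(feature)
    return {
        concept: [fmean(column) for column in zip(*features)]
        for concept, features in grouped.items()
    }


def recognize(dictionary, image_frames, speech_frames):
    """Return the concept whose model is closest to the input's integrated feature."""
    feature = extract_integrated_features(synchronize(image_frames, speech_frames))
    return min(dictionary, key=lambda concept: dist(dictionary[concept], feature))
```

A possible usage, with made-up toy feature vectors: call `learn` on a list of `(concept, image_frames, speech_frames)` tuples to obtain the dictionary, then pass a new image and speech sequence to `recognize`. The sketch assumes non-empty inputs and the same feature dimensions across samples; a practical system would likely use a sequence model rather than a time-averaged centroid.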