Abstract |
<p>PROBLEM TO BE SOLVED: To realize efficient multimodal input by suppressing recognition errors when settings for a plurality of items are input with a single utterance.</p><p>SOLUTION: A voice input section 101 receives a spoken setup instruction, and a voice recognition/interpretation section 103 recognizes and interprets its content to generate first structured data containing candidate interpretations of the recognition result. Meanwhile, a tap input section 102 detects a setup instruction input by the user and interprets its content to generate second structured data. An interpretation selection section 104 then selects, from the candidate interpretations in the first structured data, those whose setup item name matches a setup item name included in the second structured data.</p><p>COPYRIGHT: (C)2007,JPO&INPIT</p> |
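The selection step performed by the interpretation selection section 104 can be sketched roughly as follows. This is only an illustrative sketch, not the patented implementation: the `Interpretation` record, its `score` field, the function name `select_interpretations`, and the rule of keeping the highest-scoring candidate per item are all assumptions introduced here. The abstract itself states only that candidates whose setup item name matches one in the tap-derived data are selected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interpretation:
    """One candidate interpretation from the voice side (hypothetical structure)."""
    item: str     # setup item name, e.g. "paper_size"
    value: str    # setup value, e.g. "A4"
    score: float  # recognizer confidence (assumed field), higher is better


def select_interpretations(voice_candidates, tapped_items):
    """Keep voice candidates whose item name matches a tapped setup item.

    When several candidates match the same item, the highest-scoring one
    is kept. That tie-breaking rule is an assumption of this sketch.
    """
    selected = {}
    for cand in voice_candidates:
        if cand.item not in tapped_items:
            continue  # item name absent from the tap-derived data: discard
        best = selected.get(cand.item)
        if best is None or cand.score > best.score:
            selected[cand.item] = cand
    return selected


# First structured data: candidates from voice recognition/interpretation.
voice_candidates = [
    Interpretation("paper_size", "A4", 0.6),
    Interpretation("copies", "8", 0.7),       # plausible misrecognition
    Interpretation("paper_size", "A3", 0.3),
]
# Second structured data: setup item names obtained from tap input.
tapped = {"paper_size"}

result = select_interpretations(voice_candidates, tapped)
```

In this example the "copies" candidate is discarded even though it has the highest score, because the user did not tap that item. This illustrates how the tap input can suppress a misrecognized interpretation.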