METHOD AND DEVICE FOR VOICE RECOGNITION USING INTEGRATED AUDIO-VISUAL
Abstract
PURPOSE: An audio-video fusion voice recognition method and apparatus are provided that weight the audio word estimation probability and the video word estimation probability according to the signal-to-noise ratio (SNR), selectively fusing audio and video to secure a higher voice recognition rate.

CONSTITUTION: An audio-video fusion voice recognition apparatus extracts audio characteristic information from an input voice signal (S205) and calculates an audio word estimation probability from that information (S210). The apparatus extracts an SNR from the audio characteristic information (S215) and sets a weighted value using the SNR (S220). The apparatus extracts image characteristic information from an input video signal (S230) and calculates a video word estimation probability from the image characteristic information (S235). The apparatus then calculates an integrated estimation probability (S240), which is used for voice recognition (S245).

[Reference numerals] (AA) Voice command; (S200) Inputting a voice signal; (S205) Extracting audio characteristic information; (S210) Calculating an audio word estimation probability; (S215) Extracting a signal-to-noise ratio (SNR); (S220) Setting a weighted value; (S225) Inputting an image signal; (S230) Extracting image characteristic information; (S235) Calculating an image word estimation probability; (S240) Calculating an integrated estimation probability; (S245) Voice recognition
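The weighting and fusion steps (S220, S240, S245) can be illustrated with a minimal Python sketch. The abstract does not state how the weight is derived from the SNR or how the two probabilities are combined, so everything specific here is an assumption made for illustration: the linear ramp in `snr_to_weight`, its hypothetical `snr_low` and `snr_high` bounds in dB, and the convex combination in `fuse`. The function names are also hypothetical and do not come from the source.

```python
from typing import Dict


def snr_to_weight(snr_db: float, snr_low: float = 0.0, snr_high: float = 20.0) -> float:
    """Map an SNR in dB to an audio weight in [0, 1] (S220).

    Assumption: a linear ramp clamped to [0, 1]; the bounds are illustrative.
    """
    if snr_high <= snr_low:
        raise ValueError("snr_high must be greater than snr_low")
    ratio = (snr_db - snr_low) / (snr_high - snr_low)
    return min(1.0, max(0.0, ratio))


def fuse(
    audio_probs: Dict[str, float],
    video_probs: Dict[str, float],
    audio_weight: float,
) -> Dict[str, float]:
    """Combine per-word probabilities into an integrated probability (S240).

    Assumption: a convex combination; words missing from one modality count as 0.
    """
    words = audio_probs.keys() | video_probs.keys()
    return {
        word: audio_weight * audio_probs.get(word, 0.0)
        + (1.0 - audio_weight) * video_probs.get(word, 0.0)
        for word in words
    }


def recognize(
    audio_probs: Dict[str, float],
    video_probs: Dict[str, float],
    snr_db: float,
) -> str:
    """Return the word with the highest integrated probability (S245)."""
    fused = fuse(audio_probs, video_probs, snr_to_weight(snr_db))
    return max(fused, key=fused.get)


if __name__ == "__main__":
    audio = {"on": 0.6, "off": 0.4}   # audio word estimation probabilities (S210)
    video = {"on": 0.2, "off": 0.8}   # video word estimation probabilities (S235)
    print(recognize(audio, video, snr_db=20.0))  # clean audio: audio dominates
    print(recognize(audio, video, snr_db=0.0))   # noisy audio: video dominates
```

Under these assumptions, a high SNR lets the audio estimate decide the result, while a low SNR shifts the decision toward the video estimate, which matches the selective fusion described in the PURPOSE paragraph.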