Title: Speech-recognition device and speech-recognition method
Abstract: With respect to speech data 4 of an input speech 2, a speech-recognition device 1 performs, at an internal recognizer 7, recognition processing using an acoustic model 9 to calculate an internal recognition result 10 and its acoustic likelihood. A reading-addition processor 12 acquires an external recognition result 11, obtained from recognition processing of the speech data 4 of the input speech 2 by an external recognizer 19, and adds a reading to it. A re-collation processor 15 then calculates, using the acoustic model 9, the acoustic likelihood of the external recognition result 11 to provide a re-collation result 16. A result-determination processor 17 compares the acoustic likelihood of the internal recognition result 10 with the acoustic likelihood of the external recognition result 11 included in the re-collation result 16, thereby determining a final recognition result 18.
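The result-determination step described in the abstract can be illustrated with a minimal Python sketch. The `Candidate` type, the `determine_final_result` function, the example words, and the likelihood values are all hypothetical, introduced here only for illustration. The abstract states only that the two acoustic likelihoods are compared, so selecting the candidate with the highest likelihood is an assumption about the comparison rule, not a statement of the patented method.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """One recognition candidate (hypothetical structure for illustration)."""
    notation: str               # written form of the recognized word
    reading: str                # pronunciation of the word
    acoustic_likelihood: float  # score computed with the shared acoustic model
    source: str                 # "internal" or "external"


def determine_final_result(internal: List[Candidate],
                           external: List[Candidate]) -> Candidate:
    """Pick a final result by comparing acoustic likelihoods.

    Assumption: the candidate with the highest likelihood wins. Both lists
    are expected to have been scored with the same acoustic model, which is
    what makes the scores comparable.
    """
    candidates = internal + external
    if not candidates:
        raise ValueError("no recognition candidates to compare")
    return max(candidates, key=lambda c: c.acoustic_likelihood)


if __name__ == "__main__":
    # Made-up log-likelihood values, for demonstration only.
    internal = [Candidate("Yokohama", "yokohama", -120.0, "internal")]
    external = [Candidate("Yokosuka", "yokosuka", -95.5, "external")]
    final = determine_final_result(internal, external)
    print(final.notation, final.source)
```

Rescoring the external result with the device's own acoustic model, rather than trusting a score reported by the external recognizer, is what puts both candidates on the same scale before they are compared.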
Publication number: US9431010(B2) Publication date: 2016.08.30
Application number: US201314655141 Filing date: 2013.03.06
Applicant: Mitsubishi Electric Corporation Inventor: Hanazawa Toshiyuki
Classification: G10L15/00;G10L15/18;G10L15/32;G06F17/27;G10L15/30 Main classification: G10L15/00
Agency: Oblon, McClelland, Maier & Neustadt, L.L.P. Agent: Oblon, McClelland, Maier & Neustadt, L.L.P.
Main claim: 1. A speech-recognition device which acquires an internal recognition result from its recognition processing of input speech data and an external recognition result from recognition processing of said input speech data by one or more external recognition devices external to the speech-recognition device to determine a final recognition result, the speech-recognition device comprising:
  memory including:
    an acoustic model in which feature quantities of speeches are modeled;
    a language model in which notations and readings of recognition-object words of the speech-recognition device are stored; and
    a reading dictionary in which pairs of the notations and the readings of the recognition-object words and words other than the recognition-object words are stored; and
  circuitry configured to:
    transmit said input speech data to the one or more external recognition devices;
    analyze the input speech data to calculate a feature vector;
    perform, using the acoustic model, pattern collation between the calculated feature vector and each word stored in the language model to calculate their respective acoustic likelihoods;
    output, as the internal recognition result, a corresponding notation, a corresponding reading, and a corresponding acoustic likelihood of top one or more high-ranking words;
    acquire the external recognition result from recognition processing of the input speech data by the one or more external recognition devices, extract a reading corresponding to a notation included in said external recognition result using the reading dictionary, and output a result composed of said external recognition result and the extracted reading;
    perform, using the acoustic model, pattern collation between the calculated feature vector and the output result to calculate an acoustic likelihood for the external recognition result; and
    compare the corresponding acoustic likelihood of the internal recognition result with the acoustic likelihood of the external recognition result to determine the final recognition result.
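The reading-extraction element of the claim (looking up, in the reading dictionary, the reading for each notation returned by an external recognition device) can be sketched in Python as follows. The names `ExternalResultWithReading` and `add_readings` and the dictionary entries are hypothetical and serve only as illustration. The claim does not say what happens when a notation is missing from the reading dictionary, so returning `None` in that case is an assumption.

```python
from typing import Dict, List, NamedTuple, Optional


class ExternalResultWithReading(NamedTuple):
    """An external notation paired with its extracted reading (hypothetical)."""
    notation: str
    reading: Optional[str]  # None when the dictionary has no entry (assumption)


def add_readings(notations: List[str],
                 reading_dictionary: Dict[str, str]) -> List[ExternalResultWithReading]:
    """Attach a reading to each notation from an external recognizer.

    The reading dictionary is expected to cover words outside the device's
    own language model as well, since external recognizers may return them.
    """
    return [
        ExternalResultWithReading(notation, reading_dictionary.get(notation))
        for notation in notations
    ]


if __name__ == "__main__":
    # Made-up dictionary entries, for demonstration only.
    reading_dictionary = {
        "Tokyo Station": "toukyoueki",
        "Maihama": "maihama",
    }
    for result in add_readings(["Maihama", "Unknown Place"], reading_dictionary):
        print(result.notation, result.reading)
```

A reading is required at this point because the subsequent pattern collation scores the calculated feature vector against a pronunciation using the acoustic model; a notation alone gives the collation step nothing to score.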
Address: Chiyoda-ku JP