SPEECH RECOGNIZER WITH MULTI-DIRECTIONAL DECODING, Application No. US201314039383 - 传众 Patent Search

Title of invention	SPEECH RECOGNIZER WITH MULTI-DIRECTIONAL DECODING
Abstract	In an automatic speech recognition (ASR) processing system, ASR processing may be configured to process speech based on multiple channels of audio received from a beamformer. The ASR processing system may include a microphone array and the beamformer to output multiple channels of audio such that each channel isolates audio in a particular direction. The multichannel audio signals may include spoken utterances/speech from one or more speakers as well as undesired audio, such as noise from a household appliance. The ASR device may simultaneously perform speech recognition on the multi-channel audio to provide more accurate speech recognition results.
Publication No.	US2015095026(A1)	Publication date	2015.04.02
Application No.	US201314039383	Filing date	2013.09.27
Applicant	Amazon Technologies, Inc.	Inventors	Bisani Michael Maximilian Emanuel; Strom Nikko; Hoffmeister Bjorn; Thomas Ryan Paul
Classification	G10L15/00; G10L15/16	Main classification	G10L15/00
Agency		Agent
Main claim	1. A method for performing speech recognition, the method comprising: receiving a multiple-channel audio signal comprising a first channel and a second channel, wherein the first channel and second channel are created using a beamformer and a microphone array, the first channel representing audio from a first direction, and the second channel representing audio from a second direction; creating a first sequence of feature vectors for the first channel and a second sequence of feature vectors for the second channel; performing speech recognition using the first sequence of feature vectors and the second sequence of feature vectors, wherein performing speech recognition comprises: generating a first hypothesis using a speech recognition model and a first feature vector of the first sequence of feature vectors; generating a second hypothesis using the speech recognition model and a second feature vector of the second sequence of feature vectors, wherein the second hypothesis is subsequent to the first hypothesis in a speech recognition result network.
Address	Reno NV US
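
The main claim describes creating one sequence of feature vectors per beamformed channel and letting a hypothesis scored on one channel be followed, in the result network, by a hypothesis scored on another. The Python sketch below is only a toy illustration of that idea, not the patented method: the log-energy feature, the `score` function standing in for a speech recognition model, and the `switch_penalty` parameter are all hypothetical choices made for this example, and the channels are assumed to have already been produced by a beamformer.

```python
import math

def frame_features(samples, frame_len=4):
    """Toy feature extractor (assumption): one [log-energy] vector per frame."""
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        feats.append([math.log(energy + 1e-9)])
    return feats

def score(feature, threshold=-2.0):
    """Hypothetical stand-in for a speech recognition model score."""
    return feature[0] - threshold

def decode_multichannel(channel_feats, switch_penalty=1.0):
    """Viterbi-style search over (frame, channel) pairs.

    Returns, for each frame, the index of the channel whose feature vector
    was used, so consecutive hypotheses may come from different channels.
    """
    n_ch = len(channel_feats)
    n_frames = min(len(f) for f in channel_feats)
    best = [score(channel_feats[c][0]) for c in range(n_ch)]
    back = []  # back[t-1][c] = best previous channel for channel c at frame t
    for t in range(1, n_frames):
        new_best, pointers = [], []
        for c in range(n_ch):
            def arrive(p, c=c):
                # Cost of reaching channel c from channel p.
                return best[p] - (switch_penalty if p != c else 0.0)
            prev = max(range(n_ch), key=arrive)
            new_best.append(arrive(prev) + score(channel_feats[c][t]))
            pointers.append(prev)
        best = new_best
        back.append(pointers)
    # Trace back from the best final channel.
    c = max(range(n_ch), key=lambda k: best[k])
    path = [c]
    for pointers in reversed(back):
        c = pointers[c]
        path.append(c)
    path.reverse()
    return path

# Synthetic example: the "speaker" is loud in channel 0 first, then in channel 1.
ch0 = [1.0] * 8 + [0.01] * 8
ch1 = [0.01] * 8 + [1.0] * 8
path = decode_multichannel([frame_features(ch0), frame_features(ch1)])
print(path)  # [0, 0, 1, 1]
```

In this toy run the search takes the first two frames from channel 0 and the last two from channel 1, which mirrors the claim's second hypothesis (from the second channel) following the first hypothesis (from the first channel). A real system would score full acoustic feature vectors against acoustic and language models rather than a single energy value.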