Title Mixed speech recognition
Abstract The claimed subject matter includes a system and method for recognizing mixed speech from a source. The method includes training a first neural network to recognize the speech signal spoken by the speaker with a higher level of a speech characteristic from a mixed speech sample. The method also includes training a second neural network to recognize the speech signal spoken by the speaker with a lower level of the speech characteristic from the mixed speech sample. Additionally, the method includes decoding the mixed speech sample with the first neural network and the second neural network by optimizing the joint likelihood of observing the two speech signals, taking into account the probability that a specific frame is a switching point of the speech characteristic.
Publication number US9390712(B2) Publication date 2016.07.12
Application number US201414223468 Filing date 2014.03.24
Applicant Microsoft Technology Licensing, LLC. Inventors Yu Dong;Weng Chao;Seltzer Michael L.;Droppo James
Classification G10L15/16;G10L15/06;G10L15/20 Main classification G10L15/16
Agency Agent Corie Alin;Swain Sandy;Minhas Micky
Main claim 1. A method performed by a computer processor for recognizing mixed speech from a source, comprising: training a first neural network to recognize a speech signal spoken by a speaker with a higher level of a speech characteristic from a mixed speech sample; training a second neural network to recognize a speech signal spoken by a speaker with a lower level of the speech characteristic from the mixed speech sample, wherein the lower level is lower than the higher level; and decoding the mixed speech sample with the first neural network and the second neural network by optimizing the joint likelihood of observing the two speech signals.
Address Redmond WA US
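Illustrative note The joint decoding step described in the abstract can be pictured with a much-reduced sketch. It is not the patented decoder: it assumes the scores of the two networks have already been combined into two per-frame joint log-likelihoods, one for "speaker A currently has the higher level of the characteristic" and one for the swapped assignment, and it runs a two-state Viterbi search in which changing assignment costs the log-probability of a switching point. The function name `decode_assignment`, its arguments, and the uniform prior on the first frame are all hypothetical choices made for this illustration.

```python
import math


def decode_assignment(score_keep, score_swap, log_p_switch):
    """Two-state Viterbi over a per-frame speaker assignment (illustrative only).

    score_keep[t]: assumed joint log-likelihood of frame t when network 1
        (higher level) is matched to speaker A and network 2 to speaker B.
    score_swap[t]: the same quantity with the two roles exchanged.
    log_p_switch: assumed log-probability that a frame is a switching point.
    Returns (path, total), where path[t] is 0 (keep) or 1 (swap).
    """
    n = len(score_keep)
    if n == 0:
        return [], 0.0
    log_p_stay = math.log1p(-math.exp(log_p_switch))

    # best[s]: best log score of any path ending in state s at the current frame.
    # A uniform prior over the first frame's state is assumed and omitted.
    best = [score_keep[0], score_swap[0]]
    back = []  # back[t - 1][s]: best predecessor of state s at frame t

    for t in range(1, n):
        emit = (score_keep[t], score_swap[t])
        new_best, ptr = [], []
        for s in (0, 1):
            stay = best[s] + log_p_stay
            switch = best[1 - s] + log_p_switch
            if stay >= switch:
                new_best.append(stay + emit[s])
                ptr.append(s)
            else:
                new_best.append(switch + emit[s])
                ptr.append(1 - s)
        best = new_best
        back.append(ptr)

    # Trace back from the better final state.
    state = 0 if best[0] >= best[1] else 1
    total = best[state]
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path, total


if __name__ == "__main__":
    keep = [0.0, 0.0, -5.0, -5.0]
    swap = [-5.0, -5.0, 0.0, 0.0]
    print(decode_assignment(keep, swap, math.log(0.1)))
```

A small switching probability makes the search ignore a single frame that weakly favors the other assignment, while a sustained change in the scores still produces a switch.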