Title of invention: SPEECH SIGNAL SEPARATION AND SYNTHESIS BASED ON AUDITORY SCENE ANALYSIS AND SPEECH MODELING
Abstract: Systems and methods are provided for generating clean speech from a speech signal that represents a mixture of noise and speech. The clean speech may be generated from synthetic speech parameters, which are derived from the speech signal components and a model of speech that uses auditory and speech production principles. The modeling may utilize a source-filter structure of the speech signal. One or more spectral analyses are performed on the speech signal to generate spectral representations, and feature data is derived from a spectral representation. Features that correspond to the target speech according to a model of speech are grouped and separated from the feature data. The synthetic speech parameters, including spectral envelope, pitch data, and voice classification data, are generated from the features corresponding to the target speech.
Publication number: US2015025881(A1)  Publication date: 2015.01.22
Application number: US201414335850  Filing date: 2014.07.18
Applicant: Audience, Inc.  Inventors: Carlos Avendano; David Klein; John Woodruff; Michael M. Goodwin
Classification: G10L15/20  Main classification: G10L15/20
Agency:  Agent:
Main claim: 1. A method for generating clean speech from a mixture of noise and speech, the method comprising: deriving, based on the mixture of noise and speech and a model of speech, speech parameters, the deriving using at least one hardware processor; and synthesizing, based at least partially on the speech parameters, clean speech.
Address: Mountain View, CA, US
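Illustrative note: the abstract's final step, synthesizing speech from a spectral envelope, pitch data, and voice classification data through a source-filter structure, can be pictured with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent: the function names, the dictionary keys, the choice of an all-pole filter to stand in for the spectral envelope, and the frame size and sample rate. It shows only the synthesis side, not the separation or feature-grouping stages.

```python
import random


def make_excitation(n_samples, sample_rate, pitch_hz, voiced, seed=0):
    """Source: impulse train at the pitch period if voiced, white noise otherwise."""
    if voiced:
        period = max(1, round(sample_rate / pitch_hz))
        return [1.0 if i % period == 0 else 0.0 for i in range(n_samples)]
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]


def all_pole_filter(excitation, coeffs):
    """Filter: y[n] = x[n] - sum_k coeffs[k] * y[n-1-k] (assumed envelope model)."""
    out = []
    for n, x in enumerate(excitation):
        acc = x
        for k, a in enumerate(coeffs):
            if n - 1 - k >= 0:
                acc -= a * out[n - 1 - k]
        out.append(acc)
    return out


def synthesize_frame(params, n_samples=160, sample_rate=8000):
    """Combine hypothetical per-frame parameters into one frame of synthetic speech."""
    excitation = make_excitation(
        n_samples, sample_rate, params["pitch_hz"], params["voiced"]
    )
    return all_pole_filter(excitation, params["envelope_coeffs"])


# Hypothetical parameters for one voiced frame: 100 Hz pitch, one-pole envelope.
frame = synthesize_frame(
    {"pitch_hz": 100.0, "voiced": True, "envelope_coeffs": [-0.9]}
)
```

In this sketch the voice classification flag selects the excitation type, the pitch value sets the spacing of the impulses, and the envelope coefficients shape the spectrum. A real system would update these parameters frame by frame and smooth the transitions between frames.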