Title of invention: Speech processing system and method
Abstract: A method for identifying a plurality of speakers in audio data and for decoding the speech spoken by said speakers, the method comprising: receiving speech; dividing the speech into segments as it is received; and processing the received speech segment by segment, in the order received, to identify the speaker and to decode the speech. The processing comprises: performing a primary decoding of the segment using an acoustic model and a language model; obtaining, during the primary decoding, segment parameters indicating the differences between the speaker of the segment and a base speaker; comparing the segment parameters with a plurality of stored speaker profiles to determine the identity of the speaker, and selecting a speaker profile for said speaker; updating the selected speaker profile; performing a further decoding of the segment using a speaker-independent acoustic model adapted using the updated speaker profile; and outputting the decoded speech for the identified speaker. The speaker profiles are updated as further segments of speech relating to a speaker profile are processed.
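The per-segment loop described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: the Euclidean distance, the running-mean profile update, the `new_speaker_threshold` value, the creation of a new profile when nothing matches, and the `primary_decode` / `further_decode` callables are all assumptions introduced here for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class SpeakerProfile:
    """Hypothetical profile: a running mean of segment parameters."""
    name: str
    mean: list
    count: int = 1

    def update(self, params):
        # Assumed update rule: incremental mean over all segments seen so far.
        self.count += 1
        self.mean = [m + (p - m) / self.count for m, p in zip(self.mean, params)]


def distance(a, b):
    # Assumed similarity measure: plain Euclidean distance.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify_and_update(params, profiles, new_speaker_threshold=1.0):
    """Select the closest stored profile and update it.

    Creating a new profile when no stored one is close enough is an
    assumption of this sketch, as is the threshold value.
    """
    best = min(profiles, key=lambda p: distance(p.mean, params), default=None)
    if best is None or distance(best.mean, params) > new_speaker_threshold:
        best = SpeakerProfile(name=f"speaker_{len(profiles) + 1}", mean=list(params))
        profiles.append(best)
    else:
        best.update(params)
    return best


def process_stream(segments, primary_decode, further_decode, profiles=None):
    """Process segments in arrival order, yielding (speaker name, text).

    primary_decode(segment) -> (hypothesis, segment_params)
    further_decode(segment, profile) -> text
    Both callables are hypothetical stand-ins for the two decoding passes.
    """
    profiles = [] if profiles is None else profiles
    for segment in segments:
        _hypothesis, params = primary_decode(segment)      # primary decoding
        profile = identify_and_update(params, profiles)    # compare, select, update
        text = further_decode(segment, profile)            # adapted second pass
        yield profile.name, text
```

Because each profile is refined every time a segment is attributed to it, later segments from the same speaker are decoded against a progressively better-estimated profile, which mirrors the final sentence of the abstract.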
Publication number: US8612224(B2) Publication date: 2013.12.17
Application number: US201113215711 Filing date: 2011.08.23
Applicant: BRESLIN CATHERINE;GALES MARK JOHN FRANCIS;CHIN KEAN KHEONG;KNILL KATHERINE MARY;KABUSHIKI KAISHA TOSHIBA Inventor: BRESLIN CATHERINE;GALES MARK JOHN FRANCIS;CHIN KEAN KHEONG;KNILL KATHERINE MARY
Classification: G10L15/06;G10L17/00 Main classification: G10L15/06
Agency: Agent:
Main claim:
Address: