Title of invention: An integrated auto-diarization system which identifies a plurality of speakers in audio data and decodes the speech to create a transcript
Abstract: The invention relates to integrated auto-diarization. A method is disclosed for identifying a plurality of speakers in audio data and for decoding the speech spoken by those speakers. The method comprises receiving speech, dividing the speech into segments as it is received, and processing the received speech segment by segment, in the order received, to identify the speaker and to decode the speech. The processing of each segment comprises: performing primary decoding of the segment using an acoustic model and a language model; obtaining, during the primary decoding, segment parameters indicating the differences between the speaker of the segment and a base speaker; comparing the segment parameters with a plurality of stored speaker profiles to determine the identity of the speaker and selecting a speaker profile for the speaker; updating the selected speaker profile; performing a further decoding of the segment using a speaker-independent acoustic model adapted using the updated speaker profile; and outputting the decoded speech for the identified speaker. The speaker profiles are updated as further segments of speech relating to a speaker profile are processed.
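The per-segment loop described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation. Everything named here is an assumption made for illustration: the `SpeakerProfile` class, the `diarize` function, the use of a plain vector with Euclidean distance to compare segment parameters against profiles, the running-mean profile update, and the `new_speaker_threshold` rule for opening a new profile (the abstract only speaks of selecting among stored profiles). The two decoding passes are passed in as caller-supplied placeholders, since the abstract does not specify the recogniser.

```python
from dataclasses import dataclass
from math import dist
from typing import Callable, Iterable, List, Sequence, Tuple


@dataclass
class SpeakerProfile:
    """Hypothetical profile: running mean of the segment parameters seen so far."""
    speaker_id: int
    mean: List[float]
    count: int = 1

    def update(self, params: Sequence[float]) -> None:
        # Incremental mean, so the profile is refined with every new segment.
        self.count += 1
        self.mean = [m + (p - m) / self.count for m, p in zip(self.mean, params)]


def diarize(
    segments: Iterable[object],
    primary_decode: Callable[[object], Tuple[str, Sequence[float]]],
    adapted_decode: Callable[[object, SpeakerProfile], str],
    new_speaker_threshold: float = 1.0,  # assumed rule, not from the source
) -> Tuple[List[Tuple[int, str]], List[SpeakerProfile]]:
    """Process segments in arrival order: identify the speaker, then decode."""
    profiles: List[SpeakerProfile] = []
    transcript: List[Tuple[int, str]] = []
    for segment in segments:
        # 1. Primary decoding also yields the segment parameters.
        _, params = primary_decode(segment)
        # 2. Compare the parameters with the stored profiles and select one.
        best = min(profiles, key=lambda p: dist(p.mean, params), default=None)
        if best is None or dist(best.mean, params) > new_speaker_threshold:
            # Assumption: open a new profile when no stored profile is close.
            best = SpeakerProfile(len(profiles), list(params))
            profiles.append(best)
        else:
            # 3. Update the selected profile with this segment.
            best.update(params)
        # 4. Further decoding, adapted using the updated profile.
        transcript.append((best.speaker_id, adapted_decode(segment, best)))
    return transcript, profiles


# Toy usage: each segment is (text, parameters); both decoders are stand-ins.
toy_segments = [("hello", [0.0, 0.0]), ("hi", [5.0, 5.0]), ("again", [0.2, 0.0])]
transcript, profiles = diarize(
    toy_segments,
    primary_decode=lambda seg: (seg[0], seg[1]),
    adapted_decode=lambda seg, profile: seg[0],
)
```

In this toy run the first and third segments fall within the threshold of each other and share one profile, while the second opens a separate profile. A real system would replace the stand-in decoders with a recogniser and the vector distance with whatever comparison the speaker profiles support.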
Publication number: GB2489489(A) Publication date: 2012.10.03
Application number: GB20110005415 Filing date: 2011.03.30
Applicant: TOSHIBA RESEARCH EUROPE LIMITED Inventors: CATHERINE BRESLIN; MARK JOHN FRANCIS GALES; KEAN KHEONG CHIN; KATHERINE MARY KNILL
Classification: G10L15/26; G10L17/00 Main classification: G10L15/26
Agency: Agent:
Main claim:
Address: