Title of invention: FAST SPEAKER RECOGNITION SCORING USING I-VECTOR POSTERIORS AND PROBABILISTIC LINEAR DISCRIMINANT ANALYSIS
Abstract: A method for performing speaker recognition comprises: estimating respective uncertainties of acoustic coverage of respective speech utterance(s) by first and second speakers, the acoustic coverage representing respective sounds used by the speakers when speaking; representing the respective uncertainties of acoustic coverage in a manner that allows for efficient memory usage by discarding dependencies between uncertainties of different sounds for the speakers; representing the respective uncertainties of acoustic coverage in a manner that allows for efficient computation by representing an inverse of the respective uncertainties of acoustic coverage and then discarding the dependencies between the uncertainties of different sounds for the speakers; and computing a score between the speech utterance(s) by the speakers in a manner that leverages the respective uncertainties of the acoustic coverage during the comparison, the score being indicative of a likelihood that the speakers are the same speaker.
Publication number: US2016042739(A1); Publication date: 2016.02.11
Application number: US201414454169; Filing date: 2014.08.07
Applicant: Nuance Communications, Inc.; Inventors: Cumani Sandro; Vair Claudio; Colibro Daniele Ernesto; Laface Pietro; Farrell Kevin R.
Classification: G10L17/06; Main classification: G10L17/06
Agency: (not listed); Agent: (not listed)
Independent claim: 1. A method of speaker recognition, the method comprising: estimating respective uncertainties of acoustic coverage of at least one speech utterance by a first speaker and at least one speech utterance by a second speaker, the acoustic coverage representing respective sounds used by the first speaker and by the second speaker when speaking; representing the respective uncertainties of acoustic coverage in a manner that allows for efficient memory usage by discarding dependencies between uncertainties of different sounds for the first speaker and for the second speaker; representing the respective uncertainties of acoustic coverage in a manner that allows for efficient computation by representing an inverse of the respective uncertainties of acoustic coverage and then discarding the dependencies between the uncertainties of different sounds for the first speaker and for the second speaker; and computing a score between the at least one speech utterance by the first speaker and the at least one speech utterance by the second speaker in a manner that leverages the respective uncertainties of the acoustic coverage during the comparison, the score being indicative of a likelihood that the first speaker and the second speaker are the same speaker.
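Illustrative note (not part of the patent record): the steps recited in the claim can be sketched numerically. The Python below is a minimal sketch under stated assumptions, not the patented implementation. It assumes each utterance is summarized by a posterior mean vector and a posterior covariance matrix, treats "discarding dependencies" as keeping only the diagonal, and replaces the PLDA model with a simplified, fully diagonal two-covariance Gaussian model. The names `diag_covariance`, `diag_precision`, `toy_llr`, and the parameters `b` (between-speaker variance) and `w` (within-speaker variance) are hypothetical.

```python
import numpy as np


def diag_covariance(cov):
    """Keep only per-dimension variances (drops cross-dimension dependencies)."""
    return np.diag(np.asarray(cov, dtype=float)).copy()


def diag_precision(cov):
    """Invert the full covariance first, then keep only the diagonal."""
    return np.diag(np.linalg.inv(np.asarray(cov, dtype=float))).copy()


def toy_llr(m1, c1, m2, c2, b, w):
    """Same-speaker vs. different-speaker log-likelihood ratio.

    Assumed toy model, per dimension: m_i = y + e_i, with speaker factor
    y ~ N(0, b) and residual e_i ~ N(0, w + c_i), where c_i is the
    utterance's own posterior variance (its uncertainty).
    """
    m1, c1, m2, c2, b, w = (np.asarray(a, dtype=float)
                            for a in (m1, c1, m2, c2, b, w))
    t1 = b + w + c1              # total variance of utterance 1
    t2 = b + w + c2              # total variance of utterance 2
    det = t1 * t2 - b ** 2       # determinant of the 2x2 same-speaker covariance
    quad_same = (t2 * m1 ** 2 - 2.0 * b * m1 * m2 + t1 * m2 ** 2) / det
    quad_diff = m1 ** 2 / t1 + m2 ** 2 / t2
    per_dim = -0.5 * (np.log(det) - np.log(t1 * t2) + quad_same - quad_diff)
    return float(np.sum(per_dim))


if __name__ == "__main__":
    cov = np.array([[2.0, 1.0], [1.0, 2.0]])
    print("diagonal of covariance:", diag_covariance(cov))
    print("diagonal of precision: ", diag_precision(cov))

    m = np.array([1.0, -1.0])
    b = np.ones(2)
    w = 0.1 * np.ones(2)
    sure = 0.1 * np.ones(2)      # short on uncertainty
    unsure = 100.0 * np.ones(2)  # very uncertain utterance
    print("matching, low uncertainty: ", toy_llr(m, sure, m, sure, b, w))
    print("opposite, low uncertainty: ", toy_llr(m, sure, -m, sure, b, w))
    print("matching, high uncertainty:", toy_llr(m, unsure, m, unsure, b, w))
```

In this sketch the diagonal of the inverse generally differs from the inverse of the diagonal, which is why the order "invert, then discard dependencies" matters. The toy score also moves toward zero as the per-utterance variances grow, which is one simple way a comparison can take the uncertainty of each utterance into account.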
Address: Burlington, MA, US