Title: Speaker verification
Abstract: A speaker verification method is proposed that first builds a general model of user utterances using a set of general training speech data. The user also trains the system by providing a training utterance, such as a passphrase or other spoken utterance. Then, in a test phase, the user provides a test utterance that includes some background noise as well as a test voice sample. The background noise is used to bring the condition of the training data closer to that of the test voice sample by modifying the training data and a reduced set of the general data before adapted training and general models are created. Match scores are generated from the comparison between the adapted models and the test voice sample, and a final match score is calculated from the difference between those match scores. The final match score measures the degree of matching between the test voice sample and the training utterance. It is based on the degree of matching between the speech characteristics in the extracted feature vectors that make up the respective speech signals, and is not a direct comparison of the raw signals themselves. Thus, the method can be used to verify a speaker without necessarily requiring the speaker to provide a test phrase identical to the phrase provided in the training sample.
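The abstract does not say how the background noise is used to modify the training data. As a minimal sketch only, assuming simple additive mixing in the time domain, the step might look like the following. The function name `mix_noise` and the `snr_db` parameter are hypothetical and do not come from the patent.

```python
import math


def mix_noise(speech, noise, snr_db):
    """Add a noise sample to a speech signal at a chosen signal-to-noise ratio.

    Assumption: additive mixing. The patent text only says the training data
    is "modified" using the noise sample.
    """
    if not speech or not noise:
        raise ValueError("speech and noise must be non-empty")
    # Repeat the (usually short) noise sample to cover the whole utterance.
    tiled = [noise[i % len(noise)] for i in range(len(speech))]
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in tiled) / len(tiled)
    if noise_power == 0.0:
        return list(speech)  # silent noise sample: nothing to add
    # Scale the noise so that speech_power / scaled_noise_power matches snr_db.
    gain = math.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, tiled)]


if __name__ == "__main__":
    clean = [1.0, -1.0, 1.0, -1.0]   # toy "training utterance"
    noise = [0.5, -0.5]              # toy noise sample from the test utterance
    print(mix_noise(clean, noise, snr_db=0.0))
```

In this sketch the same noise sample would be applied both to the training utterances and to the reduced set of general utterances, so that both adapted models see conditions similar to those of the test sample.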
Publication No.: US9343067(B2)  Publication date: 2016.05.17
Application No.: US200913126859  Filing date: 2009.10.29
Applicant: BRITISH TELECOMMUNICATIONS public limited company  Inventors: Ariyaeeinia Aladdin M; Pillay Surosh G; Pawlewski Mark
Classification: G10L15/20; G10L17/12; G10L17/20  Main classification: G10L15/20
Agency: Nixon & Vanderhye P.C.  Agent: Nixon & Vanderhye P.C.
Main claim: 1. A method of verifying the identity of a speaker in a speaker verification system, said method comprising:
i) building, using a computer system including at least one computer processor, a general speaker model using feature vectors extracted from a first set of speaker utterances taken from a large population of speakers;
ii) receiving training speaker utterances provided by the speaker in a training phase, modifying the received training speaker utterances provided by the speaker in the training phase using a noise sample to obtain modified training speaker utterances, and modifying a second set of sample speaker utterances using the noise sample to obtain a modified set of background speaker utterances, wherein said second set comprises speaker utterances taken from a population of speakers that is less than that in the first set;
iii) generating an adapted target speaker model by using feature vectors extracted from the modified training speaker utterances to adapt the general speaker model, and generating a set of adapted background speaker models by using feature vectors extracted from the modified set of background speaker utterances to adapt the general speaker model;
iv) receiving a test voice sample and calculating a target match score based on a comparison between the adapted target speaker model and the received test voice sample, and calculating a set of background match scores based on a comparison between the set of adapted background speaker models and the test voice sample;
v) determining a final match score representing the degree of matching between the characteristics of the training speaker utterance and the test voice sample, wherein the final match score is dependent on the difference between the target match score and the mean background match scores; and
vi) verifying the identity of the speaker in the speaker verification system based on the final match score.
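Steps iv) to vi) of the claim can be illustrated with a short sketch. The claim states only that the final match score is "dependent on" the difference between the target match score and the mean of the background match scores; the plain difference used here is the simplest such choice, and the function names, the treatment of scores as log-likelihood-style values, and the `threshold` parameter are all assumptions made for illustration.

```python
from statistics import mean


def final_match_score(target_score, background_scores):
    """Return the target score minus the mean of the background scores.

    Assumption: scores are on a scale where higher means a better match
    (for example, log-likelihoods).
    """
    if not background_scores:
        raise ValueError("at least one background score is required")
    return target_score - mean(background_scores)


def verify_speaker(target_score, background_scores, threshold=0.0):
    """Accept the claimed identity when the final score exceeds a threshold.

    The threshold is hypothetical; the claim does not state a decision rule.
    """
    return final_match_score(target_score, background_scores) > threshold


if __name__ == "__main__":
    target = -40.0                       # score against the adapted target model
    background = [-50.0, -52.0, -48.0]   # scores against adapted background models
    print(final_match_score(target, background))
    print(verify_speaker(target, background, threshold=5.0))
```

Subtracting the mean background score normalises the target score against how well the test sample matches speakers in general under the same noise conditions, which is why both the target and background models are adapted using the same noise-modified data.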
Address: London GB