Title: Cut and paste spoofing detection using dynamic time warping
Abstract: The invention refers to a method for comparing voice utterances, the method comprising the steps: extracting a plurality of features (201) from a first voice utterance of a given text sample and extracting a plurality of features (201) from a second voice utterance of said given text sample, wherein each feature is extracted as a function of time, and wherein each feature of the second voice utterance corresponds to a feature of the first voice utterance; applying dynamic time warping (202) to one or more time dependent characteristics of the first and/or second voice utterance, e.g. by minimizing one or more distance measures, wherein a distance measure is a measure for the difference between a time dependent characteristic of the first voice utterance and a corresponding time dependent characteristic of the second voice utterance, and wherein a time dependent characteristic of a voice utterance is a time dependent characteristic of either a single feature or a combination of two or more features; calculating a total distance measure (203), wherein the total distance measure is a measure for the difference between the first voice utterance of the given text sample and the second voice utterance of said given text sample, wherein the total distance measure is calculated based on one or more pairs of said time dependent characteristics, and wherein a pair of time dependent characteristics is composed of a time dependent characteristic of the first or second voice utterance and of a dynamically time warped (202) time dependent characteristic of the respectively second or first voice utterance, or wherein a pair of time dependent characteristics is composed of a dynamically time warped (202) time dependent characteristic of the first voice utterance and of a dynamically time warped (202) time dependent characteristic of the second voice utterance.
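The extract, warp, and compare steps of the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names `dtw_path` and `total_distance`, the absolute-difference frame distance, and the averaging over features are all hypothetical choices that the record does not specify.

```python
from math import inf

def dtw_path(a, b):
    """Align two 1-D feature tracks with classic dynamic time warping.

    Returns (accumulated cost, warping path as a list of (i, j) frame pairs).
    The frame distance |a[i] - b[j]| is an assumption for this sketch.
    """
    n, m = len(a), len(b)
    # cost[i][j]: minimal accumulated distance aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the end to recover the minimizing alignment.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path

def total_distance(feats_a, feats_b):
    """Average the path-normalized DTW cost over corresponding features.

    feats_a, feats_b: dict mapping a feature name to a list of per-frame values.
    Averaging is a hypothetical way to combine the per-feature distances.
    """
    per_feature = []
    for name in feats_a:
        acc, path = dtw_path(feats_a[name], feats_b[name])
        per_feature.append(acc / len(path))
    return sum(per_feature) / len(per_feature)
```

A time-stretched copy of a track aligns with zero cost, so two genuine renditions of the same text spoken at different speeds are not penalized for tempo alone; only the residual difference after warping contributes to the total distance.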
Publication number: US9002706(B2) Publication date: 2015.04.07
Application number: US200913515281 Filing date: 2009.12.10
Applicant: Agnitio SL Inventors: Lopez Jesus Antonio Villalba; Gimenez Alfonso Ortega; Solano Eduardo Lleida; Redondo Sara Varela; Gomar Marta Garcia
Classification: G10L15/00; G10L17/00; B66B13/26; G10L17/02; G10L17/24 Main classification: G10L15/00
Agency: Galvin Patent Law LLC Agent: Galvin Patent Law LLC; Galvin Brian R.
Main claim: 1. A method for comparing voice utterances, the method comprising the steps of: receiving, at a computer, a plurality of voice utterances of a given text sample; extracting a plurality of features from a first voice utterance of the given text sample and extracting a plurality of features from a second voice utterance of said given text sample, wherein each feature is extracted as a function of time, and wherein each feature of the second voice utterance corresponds to a feature of the first voice utterance; applying dynamic time warping to one or more time dependent characteristics of the first and/or second voice utterance by minimizing one or more distance measures, wherein a distance measure is a measure of a difference between a time dependent characteristic of the first voice utterance and a corresponding time dependent characteristic of the second voice utterance, and wherein a time dependent characteristic of a voice utterance is a time dependent characteristic of either a single feature or a combination of two or more features; and calculating a total distance measure, wherein the total distance measure is a measure for a difference between the first voice utterance of the given text sample and the second voice utterance of the given text sample, wherein the total distance measure is calculated based at least on one or more pairs of time dependent characteristics, and wherein a pair of time dependent characteristics is composed of a time dependent characteristic of the first or second voice utterance and of a dynamically time warped time dependent characteristic of the respectively second or first voice utterance, or wherein a pair of time dependent characteristics is composed of a dynamically time warped time dependent characteristic of the first voice utterance and of a dynamically time warped time dependent characteristic of the second voice utterance; wherein the total distance measure is used to detect that the second voice utterance is a result of cut and paste spoofing; wherein the detection of cut and paste spoofing of a second voice utterance is accomplished by measuring abrupt temporal changes of feature values.
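The final limitation of the claim, measuring abrupt temporal changes of feature values, might be illustrated as below. The claim does not state how abruptness is quantified, so the criterion used here (a frame-to-frame jump larger than `ratio` times the median jump) and the name `abrupt_changes` are assumptions made purely for illustration.

```python
from statistics import median

def abrupt_changes(track, ratio=5.0):
    """Return frame indices where a feature value jumps abruptly.

    A jump counts as abrupt when it exceeds `ratio` times the median
    frame-to-frame jump. This threshold rule is a hypothetical criterion,
    not one taken from the patent.
    """
    jumps = [abs(track[k] - track[k - 1]) for k in range(1, len(track))]
    if not jumps:
        return []
    typical = median(jumps) or 1e-9  # guard against a perfectly flat track
    return [k for k, j in enumerate(jumps, start=1) if j > ratio * typical]

# A pitch-like track with a splice between frames 3 and 4.
spliced = [100, 101, 102, 103, 150, 151, 152]
print(abrupt_changes(spliced))
```

A smoothly varying track yields no flagged frames under this rule, while a splice between two recordings tends to show up as an isolated large jump at the junction.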
Address: Madrid ES