Title: Speaker recognition using local models
Abstract: A system and method for voice recognition are disclosed. The system enrolls speakers using enrollment voice samples and identification information. An extraction module characterizes the enrollment voice samples as high-dimensional feature vectors, or speaker data points. A data structuring module organizes the data points into a high-dimensional data structure, such as a kd-tree, in which similarity between data points dictates a distance, such as a Euclidean distance, a Minkowski distance, or a Manhattan distance. The system recognizes a speaker using an unidentified voice sample. A data querying module searches the data structure to generate a subset of approximate nearest neighbors based on a high-dimensional feature vector extracted from that sample. A data modeling module uses Parzen windows to estimate a probability density function representing how closely characteristics of the unidentified speaker match enrolled speakers, in real time, without extensive training data or parametric assumptions about the data distribution. A smoothing parameter controls the relative contributions of close and far speaker data points to the estimated density.
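Publication number: US2004225498(A1)  Publication date: 2004.11.11
Application number: US20040810232  Filing date: 2004.03.26
Applicant: RIFKIN RYAN  Inventor: RIFKIN RYAN
Classification: G10L17/00;(IPC1-7):G10L15/00  Main classification: G10L17/00
Agency:  Agent:
Principal claim:
Address: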
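
The recognition flow described in the abstract (organize enrolled data points in a kd-tree, retrieve nearest neighbors of the query, then weight them with Parzen windows) can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names, the Gaussian kernel, the toy 2-D vectors, and the speaker labels are all assumptions made for illustration, and the search here is exact rather than approximate. The parameter `sigma` plays the role of the smoothing parameter: a small value lets only the closest data points contribute, while a large value gives distant points more weight.

```python
import heapq
import math
from collections import defaultdict


def build_kdtree(points, depth=0):
    """Build a kd-tree from a list of (vector, speaker_id) pairs."""
    if not points:
        return None
    axis = depth % len(points[0][0])          # cycle through dimensions
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2                    # median point becomes the node
    return (points[mid], axis,
            build_kdtree(points[:mid], depth + 1),
            build_kdtree(points[mid + 1:], depth + 1))


def _search(node, query, k, heap):
    """Recursive k-nearest-neighbor search using Euclidean distance."""
    if node is None:
        return
    (vec, speaker), axis, left, right = node
    dist = math.dist(vec, query)
    # The heap stores negated distances, so heap[0] is the farthest kept point.
    if len(heap) < k:
        heapq.heappush(heap, (-dist, speaker))
    elif dist < -heap[0][0]:
        heapq.heapreplace(heap, (-dist, speaker))
    diff = query[axis] - vec[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    _search(near, query, k, heap)
    # Visit the far side only if it could still hold a closer point.
    if len(heap) < k or abs(diff) < -heap[0][0]:
        _search(far, query, k, heap)


def nearest_neighbors(tree, query, k):
    """Return up to k (distance, speaker_id) pairs, closest first."""
    heap = []
    _search(tree, query, k, heap)
    return sorted((-neg, speaker) for neg, speaker in heap)


def parzen_scores(neighbors, sigma):
    """Relative Gaussian Parzen-window score per speaker (sums to 1)."""
    scores = defaultdict(float)
    for dist, speaker in neighbors:
        # The Gaussian normalizing constant cancels in the ratio below.
        scores[speaker] += math.exp(-dist ** 2 / (2 * sigma ** 2))
    total = sum(scores.values())
    if total == 0:
        return {}
    return {speaker: value / total for speaker, value in scores.items()}


# Toy enrollment data: hypothetical 2-D vectors stand in for the
# high-dimensional feature vectors described in the abstract.
enrolled = [
    ((0.0, 0.0), "alice"), ((0.1, 0.2), "alice"),
    ((1.0, 1.0), "bob"), ((1.1, 0.9), "bob"), ((0.9, 1.2), "bob"),
]
tree = build_kdtree(enrolled)

query = (0.05, 0.1)                           # unidentified voice sample
neighbors = nearest_neighbors(tree, query, k=3)
scores = parzen_scores(neighbors, sigma=0.5)
best = max(scores, key=scores.get)
print(best, scores)
```

Because the density is estimated only from the retrieved neighbors, no global model has to be retrained when a speaker is enrolled; adding a speaker amounts to adding data points to the structure. A production system would likely use an approximate search and a tuned kernel width instead of the fixed values assumed here.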