Title of Invention |
I-Vector Based Clustering Training Data in Speech Recognition |
Abstract |
Methods and systems for i-vector based clustering of training data in speech recognition are described. An i-vector may be extracted from a speech segment of speech training data to represent acoustic information. The i-vectors extracted from the speech training data may be clustered into multiple clusters using a hierarchical divisive clustering algorithm. Using a cluster of the multiple clusters, an acoustic model may be trained. This trained acoustic model may be used in speech recognition. |
Application Publication Number |
US2015199960(A1) |
Application Publication Date |
2015.07.16 |
Application Number |
US201213640804 |
Filing Date |
2012.08.24 |
Applicant |
Microsoft Corporation |
Inventors |
Huo Qiang;Yan Zhi-Jie;Zhang Yu;Xu Jian |
Classification Number |
G10L15/06 |
Main Classification Number |
G10L15/06 |
Agency |
|
Agent |
|
Main Claim |
1. A computer-implemented method for clustering training data in speech recognition, the method comprising:
extracting a plurality of i-vectors from speech data including a plurality of speech segments; clustering the plurality of i-vectors into a plurality of clusters; training an acoustic model using one of the plurality of clusters; and recognizing one or more other speech segments using the trained acoustic model. |
Address |
Redmond WA US |
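
The clustering step named in the abstract (hierarchical divisive clustering of i-vectors) can be illustrated with a minimal sketch. This is not the patented method: the record does not state the splitting criterion, the distance measure, or the stopping rule, so the sketch assumes Euclidean 2-means bisection of the largest cluster and a fixed target cluster count. The i-vectors here are synthetic random vectors standing in for vectors that a real front end would extract from speech segments, and the names `two_means` and `divisive_cluster` are hypothetical.

```python
import numpy as np


def two_means(x, rng, n_iter=20):
    """Split the rows of x into two groups with plain 2-means (Euclidean).

    Assumption: the record does not specify how a cluster is split.
    """
    # Initialise the two centroids from two distinct random rows.
    centroids = x[rng.choice(len(x), size=2, replace=False)].copy()
    labels = np.zeros(len(x), dtype=int)
    for _ in range(n_iter):
        # Distance from every vector to each centroid, shape (n, 2).
        dists = np.linalg.norm(x[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for k in range(2):
            if np.any(labels == k):
                centroids[k] = x[labels == k].mean(axis=0)
    return labels


def divisive_cluster(ivectors, n_clusters, seed=0):
    """Top-down clustering: repeatedly bisect the largest cluster.

    Returns a list of index arrays, one per cluster.
    """
    rng = np.random.default_rng(seed)
    # Start with a single cluster that holds every i-vector.
    clusters = [np.arange(len(ivectors))]
    while len(clusters) < n_clusters:
        # Assumed heuristic: always split the currently largest cluster.
        clusters.sort(key=len)
        biggest = clusters.pop()
        if len(biggest) < 2:
            clusters.append(biggest)
            break
        labels = two_means(ivectors[biggest], rng)
        left, right = biggest[labels == 0], biggest[labels == 1]
        if len(left) == 0 or len(right) == 0:
            # Degenerate split (e.g. identical vectors): keep as is and stop.
            clusters.append(biggest)
            break
        clusters.extend([left, right])
    return clusters


# Synthetic stand-in for extracted i-vectors: 60 segments, 8 dimensions.
data_rng = np.random.default_rng(42)
ivectors = np.vstack([
    data_rng.normal(-5.0, 0.5, size=(30, 8)),
    data_rng.normal(5.0, 0.5, size=(30, 8)),
])

clusters = divisive_cluster(ivectors, n_clusters=2)
for i, idx in enumerate(clusters):
    # Each index set would select the speech segments used to train
    # one cluster-specific acoustic model (training is not shown here).
    print(f"cluster {i}: {len(idx)} segments")
```

In a pipeline of the kind the claim describes, each returned index set would pick out the training segments for one acoustic model; how a model is chosen at recognition time is outside this sketch.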