Title of invention |
DATA SHREDDING FOR SPEECH RECOGNITION LANGUAGE MODEL TRAINING UNDER DATA RETENTION RESTRICTIONS |
Abstract |
Training speech recognizers, e.g., their language or acoustic models, using actual user data is useful, but retaining personally identifiable information may be restricted in certain environments due to regulations. Accordingly, a method or system is provided for enabling training of a language model which includes producing segments of text in a text corpus and counts corresponding to the segments of text, the text corpus being in a depersonalized state. The method further includes enabling a system to train a language model using the segments of text in the depersonalized state and the counts. Because the data is depersonalized, actual data may be used, enabling speech recognizers to keep up-to-date with user trends in speech and usage, among other benefits. |
Publication number |
US2014278425(A1) |
Publication date |
2014.09.18 |
Application number |
US201313800738 |
Filing date |
2013.03.13 |
Applicant |
NUANCE COMMUNICATIONS, INC. |
Inventors |
Jost Uwe Helmut; Woodland Philip Charles; Katz Marcel; Shahid Syed Raza; Vozila Paul J.; Ganong, III William F. |
Classification |
G10L15/06 |
Main classification |
G10L15/06 |
Agency |
|
Agent |
|
Principal claim |
1. A method of enabling training of a language model, the method comprising:
producing segments of text in a text corpus and counts corresponding to the segments of text, the text corpus being in a depersonalized state; and enabling a system to train a language model using the segments of text in the depersonalized state and the counts. |
Address |
Burlington MA US |
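
Claim 1 describes producing segments of text with corresponding counts and then training a language model from those segments and counts. The following is a minimal illustrative sketch of that idea, not the patented implementation. It assumes the input sentences are already depersonalized, that a "segment" is a word n-gram, and that the model is a plain maximum-likelihood bigram estimate. The names `shred_corpus` and `bigram_probability` are hypothetical and do not come from the record.

```python
from collections import Counter
from typing import Iterable, Tuple

Segment = Tuple[str, ...]


def shred_corpus(sentences: Iterable[str], n: int = 2) -> Counter:
    """Break sentences into n-word segments and count them.

    Assumption: the sentences are already depersonalized. Only the
    aggregated segments and counts are kept; the original sentences
    and their order are not stored.
    """
    counts: Counter = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def bigram_probability(counts: Counter, history: str, word: str) -> float:
    """Maximum-likelihood estimate of P(word | history) from bigram counts."""
    total = sum(c for (h, _), c in counts.items() if h == history)
    return counts[(history, word)] / total if total else 0.0


# Hypothetical, already depersonalized example corpus.
corpus = [
    "please call me tomorrow",
    "please call the office",
    "call me later",
]
counts = shred_corpus(corpus)
p_me_given_call = bigram_probability(counts, "call", "me")
```

In this sketch the language model is estimated from the segment counts alone, so the full sentences need not be retained after shredding. A production system would likely add smoothing and a minimum-count threshold, neither of which is shown here.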