Title of Invention |
Method and system for processing parallel context dependent speech recognition results from a single utterance utilizing a context database |
Abstract |
A method of and system for accurately determining a caller response by processing speech-recognition results and returning that result to a directed-dialog application for further interaction with the caller. Multiple speech-recognition engines are provided that process the caller response in parallel. Returned speech-recognition results comprising confidence-score values and word-score values from each of the speech-recognition engines may be modified based on context information provided by the directed-dialog application and grammars associated with each speech-recognition engine. A context database is used to further reduce or add weight to confidence-score values and word-score values, remove phrases and/or words, and add phrases and/or words to the speech-recognition engine results. In situations where a predefined threshold-confidence-score value is not exceeded, a new dynamic grammar may be created. A set of n-best hypotheses of what the caller uttered is returned to the directed-dialog application. |
Publication Number |
US9117453(B2) |
Publication Date |
2015.08.25 |
Application Number |
US201012982146 |
Filing Date |
2010.12.30 |
Applicant |
Volt Delta Resources, LLC |
Inventor |
Bielby Gregory J. |
Classification Numbers |
G10L15/00;G10L17/00;G10L21/00;G10L15/32;G10L15/22;G06F17/28;G10L17/02;G10L15/26 |
Main Classification Number |
G10L15/00 |
Agency |
Winstead PC |
Agent |
Winstead PC |
Main Claim |
1. A system comprising:
a directed-dialog-processor server having a directed-dialog-processor application executing thereon;
a speech-recognition-engine server having a plurality of parallel-operable speech-recognition-engine applications executing thereon;
wherein the plurality of parallel-operable speech-recognition-engine applications each provide a different speech-recognition capability;
a context database;
a multiple-recognition-processor server in data communication with the directed-dialog-processor server, the speech-recognition-engine server, and the context database and having a multiple-recognition-processor application executing thereon; and
wherein the multiple-recognition-processor server is operable, via the multiple-recognition-processor application, to:
receive context information and a forwarded caller response from the directed-dialog-processor application;
select, using the context information, a set of parallel-operable speech-recognition-engine applications from the plurality of parallel-operable speech-recognition-engine applications;
combine the context information with additional context information from the context database to form modified context information;
forward to each speech-recognition-engine application in the selected set the modified context information, the forwarded caller response, and a request to perform speech recognition of the forwarded caller response;
receive from each speech-recognition-engine application in the selected set an n-best list comprising at least one confidence-score value and at least one word-score value;
wherein the at least one confidence-score value and the at least one word-score value in each n-best list are modified by a weight-multiplier value based on the context information provided by the directed-dialog-processor application, thereby creating a modified n-best list;
wherein each modified n-best list is combined into a single, sorted combined n-best list; and
wherein the at least one confidence-score value and the at least one word-score value of the sorted combined n-best list are modified by determining presence of phrases and words of the sorted combined n-best list in the context database. |
Address |
Orange CA US |
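
The score-handling steps recited in claim 1 (weight each engine's n-best list, merge the lists into one sorted list, then adjust scores according to what the context database contains) can be illustrated with a minimal Python sketch. This is not the patented implementation. The names Hypothesis, apply_weight, combine and rescore_with_context, the sample phrases, the per-engine weights, and the boost and penalty factors are all hypothetical. The claim says only that scores are "modified by determining presence" in the context database, so multiplying by a fixed boost or penalty is an assumption made here for illustration.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Hypothesis:
    phrase: str
    confidence: float               # confidence-score value for the whole phrase
    word_scores: Tuple[float, ...]  # one word-score value per word in the phrase


def apply_weight(nbest, weight):
    """Scale every confidence and word score in one engine's n-best list."""
    return [
        replace(h, confidence=h.confidence * weight,
                word_scores=tuple(s * weight for s in h.word_scores))
        for h in nbest
    ]


def combine(nbest_lists):
    """Merge the modified n-best lists into one list, highest confidence first."""
    merged = [h for nbest in nbest_lists for h in nbest]
    return sorted(merged, key=lambda h: h.confidence, reverse=True)


def rescore_with_context(combined, context_phrases, context_words,
                         boost=1.2, penalty=0.8):
    """Adjust scores by presence in the context database, then re-sort.

    boost and penalty are assumed values; the claim does not specify them.
    """
    rescored = []
    for h in combined:
        phrase_factor = boost if h.phrase in context_phrases else penalty
        word_scores = tuple(
            s * (boost if w in context_words else penalty)
            for w, s in zip(h.phrase.split(), h.word_scores)
        )
        rescored.append(replace(h, confidence=h.confidence * phrase_factor,
                                word_scores=word_scores))
    return sorted(rescored, key=lambda h: h.confidence, reverse=True)


# Hypothetical results from two engines for the same utterance.
engine_a = [Hypothesis("john smith", 0.80, (0.9, 0.7)),
            Hypothesis("joan smith", 0.60, (0.5, 0.7))]
engine_b = [Hypothesis("john smyth", 0.70, (0.9, 0.5))]

# Hypothetical weight multipliers derived from the dialog context.
modified = [apply_weight(engine_a, 1.0), apply_weight(engine_b, 1.1)]
combined = combine(modified)

# Hypothetical context-database contents.
final = rescore_with_context(combined,
                             context_phrases={"john smith"},
                             context_words={"john", "smith"})

for h in final:
    print(h.phrase, round(h.confidence, 3))
```

The sketch keeps identical phrases returned by different engines as separate entries and does not clamp scores to a fixed range. The claim does not address either point, so a real system might handle both differently.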