Title: Method and system for processing parallel context dependent speech recognition results from a single utterance utilizing a context database
Abstract: A method of and system for accurately determining a caller response by processing speech-recognition results and returning that result to a directed-dialog application for further interaction with the caller. Multiple speech-recognition engines are provided that process the caller response in parallel. Returned speech-recognition results comprising confidence-score values and word-score values from each of the speech-recognition engines may be modified based on context information provided by the directed-dialog application and grammars associated with each speech-recognition engine. A context database is used to further reduce or add weight to confidence-score values and word-score values, remove phrases and/or words, and add phrases and/or words to the speech-recognition engine results. In situations where a predefined threshold-confidence-score value is not exceeded, a new dynamic grammar may be created. A set of n-best hypotheses of what the caller uttered is returned to the directed-dialog application.
Publication Number: US9117453(B2) Publication Date: 2015.08.25
Application Number: US201012982146 Filing Date: 2010.12.30
Applicant: Volt Delta Resources, LLC Inventor: Bielby Gregory J.
Classification: G10L15/00;G10L17/00;G10L21/00;G10L15/32;G10L15/22;G06F17/28;G10L17/02;G10L15/26 Main Classification: G10L15/00
Agency: Winstead PC Agent: Winstead PC
Principal Claim: 1. A system comprising: a directed-dialog-processor server having a directed-dialog-processor application executing thereon; a speech-recognition-engine server having a plurality of parallel-operable speech-recognition-engine applications executing thereon; wherein the plurality of parallel-operable speech-recognition-engine applications each provide a different speech-recognition capability; a context database; a multiple-recognition-processor server in data communication with the directed-dialog-processor server, the speech-recognition-engine server, and the context database and having a multiple-recognition-processor application executing thereon; and wherein the multiple-recognition-processor server is operable, via the multiple-recognition-processor application, to: receive context information and a forwarded caller response from the directed-dialog-processor application; select, using the context information, a set of parallel-operable speech-recognition-engine applications from the plurality of parallel-operable speech-recognition-engine applications; combine the context information with additional context information from the context database to form modified context information; forward to each speech-recognition-engine application in the selected set the modified context information, the forwarded caller response, and a request to perform speech recognition of the forwarded caller response; receive from each speech-recognition-engine application in the selected set an n-best list comprising at least one confidence-score value and at least one word-score value; wherein the at least one confidence-score value and the at least one word-score value in each n-best list are modified by a weight-multiplier value based on the context information provided by the directed-dialog-processor application, thereby creating a modified n-best list; wherein each modified n-best list is combined into a single, sorted combined n-best list; and wherein the at least one confidence-score value and the at least one word-score value of the sorted combined n-best list are modified by determining presence of phrases and words of the sorted combined n-best list in the context database.
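The score-handling steps recited in the claim (per-engine weighting, merging into one sorted n-best list, then rescoring against the context database) can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the patent: the `Hypothesis` record, the `combine_n_best` function, the per-engine weight table, and in particular the `boost` and `penalty` factors, which are only one assumed way of "adding or reducing weight" when a phrase is or is not found in the context database. The context database is modeled as a plain set of lowercase phrases.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Hypothesis:
    """One entry of an n-best list (hypothetical record layout)."""
    phrase: str
    confidence: float         # confidence-score value for the whole phrase
    word_scores: List[float]  # one word-score value per recognized word
    engine: str               # which recognition engine produced it


def scale(h: Hypothesis, factor: float) -> Hypothesis:
    """Return a copy of h with confidence and word scores multiplied by factor."""
    return Hypothesis(h.phrase, h.confidence * factor,
                      [w * factor for w in h.word_scores], h.engine)


def combine_n_best(results: Dict[str, List[Hypothesis]],
                   engine_weights: Dict[str, float],
                   context_phrases: Set[str],
                   boost: float = 1.2,      # assumed factor, not from the patent
                   penalty: float = 0.8     # assumed factor, not from the patent
                   ) -> List[Hypothesis]:
    # Step 1: apply the per-engine weight-multiplier value to each n-best list.
    combined: List[Hypothesis] = []
    for engine, hyps in results.items():
        weight = engine_weights.get(engine, 1.0)
        combined.extend(scale(h, weight) for h in hyps)

    # Step 2: merge into a single combined list sorted by confidence.
    combined.sort(key=lambda h: h.confidence, reverse=True)

    # Step 3: adjust scores by presence of each phrase in the context database.
    rescored = [
        scale(h, boost if h.phrase.lower() in context_phrases else penalty)
        for h in combined
    ]

    # Re-sort so the returned list reflects the adjusted scores.
    rescored.sort(key=lambda h: h.confidence, reverse=True)
    return rescored
```

In this sketch the engine weights would be derived from the context information supplied by the directed-dialog application (for example, favoring a digits-oriented engine when the prompt asked for a number); how that mapping is built is left out because the record above does not specify it.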
Address: Orange CA US