Title: Automatic methods to predict error rates and detect performance degradation
Abstract: An automatic speech recognition dictation application is described that includes a dictation module for performing automatic speech recognition in a dictation session with a speaker user to determine representative text corresponding to input speech from the speaker user. A post-processing module develops a session-level metric correlated to the verbatim recognition error rate of the dictation session, and determines whether recognition performance degraded during the dictation session based on a comparison of the session metric to a baseline metric.
Publication number: US9269349(B2) Publication date: 2016.02.23
Application number: US201213479945 Filing date: 2012.05.24
Applicant: Nuance Communications, Inc. Inventors: Xiao Xiaoqiang; Nagesha Venkatesh
Classification: G10L15/06; G10L15/01; G10L15/065 Main classification: G10L15/06
Agency: Banner & Witcoff, Ltd. Agent: Banner & Witcoff, Ltd.
Principal claim: 1. A computer-implemented method comprising: generating, by a computing system and utilizing a set of speech-recognition models, text corresponding to input speech spoken by a user during a first dictation session; determining, by the computing system, based on the text corresponding to the input speech spoken by the user during the first dictation session, and without comparing the text corresponding to the input speech spoken by the user during the first dictation session to preexisting text corresponding to the input speech spoken by the user during the first dictation session, a metric correlated to a verbatim-recognition error rate of the first dictation session; generating, by the computing system and utilizing an updated set of speech-recognition models, text corresponding to input speech spoken by the user during a second dictation session; determining, by the computing system, based on the text corresponding to the input speech spoken by the user during the second dictation session, and without comparing the text corresponding to the input speech spoken by the user during the second dictation session to preexisting text corresponding to the input speech spoken by the user during the second dictation session, a metric correlated to a verbatim-recognition error rate of the second dictation session; and comparing, by the computing system, the metric correlated to the verbatim-recognition error rate of the first dictation session with the metric correlated to the verbatim-recognition error rate of the second dictation session to determine if recognition performance degraded for the user during the second dictation session due to utilization of the updated set of speech-recognition models.
Address: Burlington, MA, US
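
The comparison recited in claim 1 can be illustrated with a minimal sketch. The record does not say which metric is used, so this example assumes a hypothetical one: one minus the mean per-word recognizer confidence, computed without any reference transcript. The function names, the sample confidence values, and the tolerance threshold are likewise illustrative assumptions, not taken from the patent.

```python
from statistics import mean


def session_metric(word_confidences):
    """Reference-free proxy for a session's error rate.

    ASSUMPTION: 1 - mean word confidence stands in for the claimed
    "metric correlated to a verbatim-recognition error rate"; the
    record does not specify the actual metric.
    """
    if not word_confidences:
        raise ValueError("session has no recognized words")
    return 1.0 - mean(word_confidences)


def performance_degraded(baseline_metric, new_metric, tolerance=0.02):
    """Flag degradation when the new session's metric exceeds the
    baseline by more than a tolerance (hypothetical threshold)."""
    return new_metric - baseline_metric > tolerance


# Hypothetical per-word confidences from two dictation sessions:
# the first decoded with the original models, the second with updated ones.
first_session = [0.95, 0.90, 0.92, 0.88]
second_session = [0.80, 0.75, 0.85, 0.70]

baseline = session_metric(first_session)
updated = session_metric(second_session)
print(performance_degraded(baseline, updated))
```

In this sketch a higher metric means a higher predicted error rate, so the second session is flagged only when its metric rises above the first session's by more than the tolerance. A real system would likely calibrate both the metric and the threshold against sessions with known error rates.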