Title: SYSTEM AND METHOD FOR HANDLING REPEAT QUERIES DUE TO WRONG ASR OUTPUT BY MODIFYING AN ACOUSTIC, A LANGUAGE AND A SEMANTIC MODEL
Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for handling expected repeat speech queries or other inputs. The method causes a computing device to detect a misrecognized speech query from a user, determine a tendency of the user to repeat speech queries based on previous user interactions, and adapt a speech recognition model based on the determined tendency before an expected repeat speech query. The method can further include recognizing the expected repeat speech query from the user based on the adapted speech recognition model. Adapting the speech recognition model can include modifying an acoustic model, a language model, and/or a semantic model. Adapting the speech recognition model can also include preparing a personalized search speech recognition model for the expected repeat query based on usage history and entries in a recognition lattice. The method can include retaining unmodified speech recognition models with adapted speech recognition models.
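The flow described in the abstract (detect a misrecognition, estimate the user's tendency to repeat, adapt before the expected repeat, keep the unmodified model) can be sketched roughly as follows. This is only an illustrative sketch and not the patented implementation: `RecognitionModel`, `UserHistory`, `prepare_for_repeat`, the bias values, and the 0.5 threshold are all hypothetical names and numbers chosen for the example.

```python
from dataclasses import dataclass, field, replace
from typing import Dict, List


@dataclass(frozen=True)
class RecognitionModel:
    """Toy stand-in for a recognizer: one bias weight per component (assumed)."""
    acoustic_bias: float = 0.0
    language_bias: float = 0.0
    semantic_bias: float = 0.0


@dataclass
class UserHistory:
    """Hypothetical per-user log: True where a past query was a repeat."""
    repeated_flags: List[bool] = field(default_factory=list)

    def repeat_tendency(self) -> float:
        # Fraction of past queries that were repeats; 0.0 with no history.
        if not self.repeated_flags:
            return 0.0
        return sum(self.repeated_flags) / len(self.repeated_flags)


def prepare_for_repeat(
    base: RecognitionModel,
    history: UserHistory,
    misrecognized: bool,
    threshold: float = 0.5,  # assumed cutoff, not taken from the source
) -> Dict[str, RecognitionModel]:
    """Return the models available for the next query.

    The unmodified model is always retained; an adapted copy is added only
    when a misrecognition was detected and the user tends to repeat queries.
    """
    models = {"unmodified": base}
    if misrecognized and history.repeat_tendency() >= threshold:
        # Illustrative adaptation: nudge the acoustic and language components.
        models["adapted"] = replace(
            base,
            acoustic_bias=base.acoustic_bias + 0.1,
            language_bias=base.language_bias + 0.2,
        )
    return models
```

In this sketch, a caller would pass the returned dictionary to whatever recognizer handles the next utterance, falling back to `"unmodified"` when no `"adapted"` entry exists.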
Publication No.: US2015194150(A1)  Publication Date: 2015.07.09
Application No.: US201514666548  Filing Date: 2015.03.24
Applicant: AT&T Intellectual Property I, L.P.  Inventors: LJOLJE Andrej; CASEIRO Diamantino Antonio
Classification: G10L15/18  Main Classification: G10L15/18
Agency:  Agent:
Main Claim: 1. A method comprising: identifying, based on past interactions with a user, an adaptation schema which, when applied to a speech recognition model, increases a likelihood the speech recognition model will recognize misrecognized speech from the user relative to an unadapted speech recognition model; determining, via a processor, that the user has previously repeated speech inputs based on interactions with the user prior to initiating the dialog, to yield a determination; and when the determination indicates that the user has previously repeated speech inputs, adapting a speech recognition model using the adaptation schema before an expected repeat speech input, wherein adapting the speech recognition model further comprises modifying two of an acoustic model, a language model, and a semantic model.
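As a rough illustration of the conditional step in the claim, the sketch below applies an adaptation schema only when the user has repeated inputs before, and only when the schema touches two of the three components. Everything here is hypothetical: `AdaptationSchema`, `adapt_if_repeater`, the dictionary-of-weights model, and the additive update are assumptions made for the example. Checking for exactly two components is one possible reading used for illustration, not a construction of the claim.

```python
from dataclasses import dataclass
from typing import Dict

# Component names assumed for this sketch.
COMPONENTS = ("acoustic", "language", "semantic")


@dataclass(frozen=True)
class AdaptationSchema:
    """Hypothetical schema: maps a component name to an additive weight change."""
    updates: Dict[str, float]

    def __post_init__(self) -> None:
        unknown = set(self.updates) - set(COMPONENTS)
        if unknown:
            raise ValueError(f"unknown components: {sorted(unknown)}")
        if len(self.updates) != 2:
            # Illustrative check: the schema must modify two components.
            raise ValueError("schema must modify two of the three components")


def adapt_if_repeater(
    model: Dict[str, float],
    schema: AdaptationSchema,
    has_repeated_before: bool,
) -> Dict[str, float]:
    """Return an adapted copy of the model, or the model itself if no adaptation applies."""
    if not has_repeated_before:
        return model
    adapted = dict(model)  # copy so the unadapted model stays available
    for component, delta in schema.updates.items():
        adapted[component] += delta
    return adapted
```

A caller might build the schema once from past interactions and then call `adapt_if_repeater` just before the expected repeat input, keeping the original dictionary for comparison.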
Address: Atlanta GA US