Title: Pronunciation learning from user correction
Abstract: Systems and methods are described for adding entries to a custom lexicon used by a speech recognition engine of a speech interface in response to user interaction with the speech interface. In one embodiment, a speech signal is obtained when the user speaks a name of a particular item to be selected from among a finite set of items. If a phonetic description of the speech signal is not recognized by the speech recognition engine, then the user is presented with a means for selecting the particular item from among the finite set of items by providing input in a manner that does not include speaking the name of the item. After the user has selected the particular item via the means for selecting, the phonetic description of the speech signal is stored in association with a text description of the particular item in the custom lexicon.
Publication number: US9640175(B2) | Publication date: 2017.05.02
Application number: US201113268281 | Filing date: 2011.10.07
Applicant: Microsoft Technology Licensing, LLC | Inventors: Liu Wei-Ting Frank;Lovitt Andrew;Tomko Stefanie;Ju Yun-Cheng
Classification: G06F17/21;G10L15/06;G10L21/00;G10L15/00;G09G5/00;H04M1/64;H04M3/00;G08G1/123;G09B19/06;A61B8/00;G10L15/22 | Main classification: G06F17/21
Agency: Fiala & Weaver P.L.L.C. | Agent: Fiala & Weaver P.L.L.C.
Main claim: 1. A method for updating a custom lexicon used by a speech recognition engine that comprises part of a speech interface, comprising: obtaining a speech signal by the speech interface when a user speaks a name of a particular item for the purpose of selecting the particular item from among a finite set of items; presenting the user with a means for selecting the particular item from among the finite set of items by providing input in a manner that does not include speaking the name of the item in response to determining that a phonetic description of the speech signal is not recognized by the speech recognition engine; after the user has selected the particular item via the means for selecting, storing the phonetic description of the speech signal in association with a text description of the particular item in the custom lexicon, the custom lexicon comprising a user-specific custom lexicon that is used to recognize speech of the user only and a system custom lexicon that is used to recognize speech of all users of a system, the storing comprising: determining if the particular item is of a particular type, automatically storing the phonetic description of the speech signal only in the user-specific custom lexicon in response to determining that the particular item is of the particular type, and automatically storing the phonetic description of the speech signal only in the system custom lexicon in response to determining that the particular item is not of the particular type; and elevating the phonetic description of the speech signal stored in the user-specific custom lexicon to the system custom lexicon in response to determining that a certain number of user-specific custom lexicons all include a same or similar pronunciation for the particular item.
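The flow recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and everything concrete in it is an assumption made for illustration: the names `LexiconStore`, `select_item` and `choose_without_speech`, the representation of a phonetic description as a plain phoneme string, the choice of "contact" as the "particular type", the threshold of 3 for "a certain number", and the use of exact string equality in place of "same or similar" pronunciation.

```python
from collections import defaultdict

ELEVATION_THRESHOLD = 3            # assumed; the claim says only "a certain number"
USER_SPECIFIC_TYPES = {"contact"}  # assumed example of the "particular type"


class LexiconStore:
    """Hypothetical holder for the system and user-specific custom lexicons."""

    def __init__(self):
        # item text -> set of phonetic descriptions, shared by all users
        self.system = defaultdict(set)
        # user id -> (item text -> set of phonetic descriptions)
        self.per_user = defaultdict(lambda: defaultdict(set))

    def recognize(self, user, phonemes, items):
        """Return the item whose stored pronunciation matches, else None."""
        user_lex = self.per_user.get(user, {})
        for text in items:
            if phonemes in user_lex.get(text, ()) or phonemes in self.system.get(text, ()):
                return text
        return None

    def learn(self, user, phonemes, text, item_type):
        """Store the pronunciation in exactly one lexicon, chosen by item type."""
        if item_type in USER_SPECIFIC_TYPES:
            self.per_user[user][text].add(phonemes)
            self._maybe_elevate(phonemes, text)
        else:
            self.system[text].add(phonemes)

    def _maybe_elevate(self, phonemes, text):
        """Copy to the system lexicon once enough user lexicons agree."""
        count = sum(
            1 for lex in self.per_user.values() if phonemes in lex.get(text, ())
        )
        if count >= ELEVATION_THRESHOLD:
            self.system[text].add(phonemes)


def select_item(store, user, phonemes, items, choose_without_speech):
    """Resolve a spoken selection; fall back to non-speech input and learn.

    items maps item text to item type. choose_without_speech is a callback
    standing in for the non-speech selection means (for example, a touch list).
    """
    text = store.recognize(user, phonemes, items)
    if text is None:
        text = choose_without_speech(list(items))
        store.learn(user, phonemes, text, items[text])
    return text


if __name__ == "__main__":
    store = LexiconStore()
    items = {"Anya Kowalczyk": "contact", "Jazz FM": "station"}
    for user in ("u1", "u2", "u3"):
        select_item(store, user, "AA N Y AH", items, lambda opts: "Anya Kowalczyk")
    # After three users have corrected the same way, a fourth user is recognized.
    print(store.recognize("u4", "AA N Y AH", items))
```

In this sketch the fallback callback runs only when recognition fails, so a pronunciation learned once is not prompted for again. A real system would replace the exact-match comparison with a phonetic similarity measure.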
Address: Redmond, WA, US