Title of invention: LEARNING LANGUAGE MODELS FROM SCRATCH BASED ON CROWD-SOURCED USER TEXT INPUT
Abstract: Technology is described for developing a language model for a language recognition system from scratch based on aggregating and analyzing text input from multiple users of the language. The technology allows a user to select a language, and if no existing language model is available for the selected language, provides a new language model for the selected language, monitors and collects information about the use of words in the selected language, combines information collected from multiple users of the selected language, and updates the user's language model based on the combined information from multiple users of the selected language.
Publication number: US2015309984(A1) Publication date: 2015.10.29
Application number: US201414262304 Filing date: 2014.04.25
Applicant: Nuance Communications, Inc. Inventors: Bradford Ethan R.;Corston Simon;McCray Donni;Cross Ryan N.
Classification: G06F17/27;G06F17/28 Main classification: G06F17/27
Agency: Agent:
Principal claim: 1. A tangible computer-readable memory having contents configured to cause at least one computer having a processor to perform a method for assisting in building a new language model used by language recognition systems, the method comprising: initializing a language model for a selected language, wherein a language recognition system that uses a language model to predict words in a language is ineffective to predict intended words in the selected language; monitoring use of words in the selected language on various computing devices by multiple users of the selected language; collecting, in substantially real-time, information about the monitored use of the words in the selected language by the multiple users of the selected language; generating updates to the language model based on the collected information about the monitored use of the words in the selected language; and providing to the various computing devices the generated updates to the language model, such that a language recognition system using the language model including the generated updates is more effective to predict intended words in the selected language.
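The steps recited in the claim (initialize an empty model, monitor and collect word use from multiple users, generate an update, provide it to devices) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the patent: the names `LanguageModel` and `Aggregator`, the unigram word-count representation, whitespace tokenization, and the `min_users` threshold that keeps a word only once more than one user has typed it.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class LanguageModel:
    """Hypothetical on-device model: maps each known word to a count."""
    language: str
    counts: Counter = field(default_factory=Counter)  # starts empty ("from scratch")

    def predict(self, prefix: str, k: int = 3) -> list[str]:
        # Rank known words starting with the typed prefix by frequency,
        # breaking ties alphabetically so the result is deterministic.
        matches = [(w, c) for w, c in self.counts.items() if w.startswith(prefix)]
        matches.sort(key=lambda wc: (-wc[1], wc[0]))
        return [w for w, _ in matches[:k]]

    def apply_update(self, update: Counter) -> None:
        # In this sketch an update is a full snapshot that replaces the counts.
        self.counts = Counter(update)


class Aggregator:
    """Hypothetical server side: combines word use reported by many users."""

    def __init__(self, language: str, min_users: int = 2) -> None:
        self.language = language
        self.min_users = min_users  # assumed threshold, not from the patent
        self.word_counts: Counter = Counter()
        self.word_users: dict[str, set[str]] = {}

    def collect(self, user_id: str, text: str) -> None:
        # Record each word typed and which user typed it.
        for word in text.lower().split():
            self.word_counts[word] += 1
            self.word_users.setdefault(word, set()).add(user_id)

    def generate_update(self) -> Counter:
        # Keep only words seen from at least `min_users` distinct users.
        return Counter({
            w: c for w, c in self.word_counts.items()
            if len(self.word_users[w]) >= self.min_users
        })
```

A usage sketch under the same assumptions: create `Aggregator("xx")`, call `collect(user_id, text)` as text arrives from each device, then pass `generate_update()` to each device's `LanguageModel.apply_update`. Before the first update, `predict` returns an empty list, which corresponds to the claim's starting condition that the recognition system is ineffective for the selected language.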
Address: Burlington MA US