UNSUPERVISED AND ACTIVE LEARNING IN AUTOMATIC SPEECH RECOGNITION FOR CALL CLASSIFICATION, Application No. US201414468375

Title of invention	UNSUPERVISED AND ACTIVE LEARNING IN AUTOMATIC SPEECH RECOGNITION FOR CALL CLASSIFICATION
Abstract	Utterance data that includes at least a small amount of manually transcribed data is provided. Automatic speech recognition is performed on ones of the utterance data not having a corresponding manual transcription to produce automatically transcribed utterances. A model is trained using all of the manually transcribed data and the automatically transcribed utterances. A predetermined number of utterances not having a corresponding manual transcription are intelligently selected and manually transcribed. Ones of the automatically transcribed data as well as ones having a corresponding manual transcription are labeled. In another aspect of the invention, audio data is mined from at least one source, and a language model is trained for call classification from the mined audio data to produce a language model.
Publication No.	US2015046159(A1)	Publication date	2015.02.12
Application No.	US201414468375	Filing date	2014.08.26
Applicant	AT&T Intellectual Property II, L.P.	Inventors	Hakkani-Tur Dilek Z.;Rahim Mazin G.;Riccardi Giuseppe;Tur Gokhan
Classification	G10L15/18;G10L15/26	Main classification	G10L15/18
Agency		Agent
Main claim	1. A method comprising: performing automatic speech recognition using a bootstrap model on utterance data not having a corresponding manual transcription, to produce automatically transcribed utterances, wherein the bootstrap model is based on text data mined from a website relevant to a specific domain; selecting a predetermined number of utterances not having a corresponding manual transcription based on a geometrically computed n-tuple confidence score; receiving transcriptions of the predetermined number of utterances, wherein the transcriptions are made by a human being; and generating a language model based on the automatically transcribed utterances, the predetermined number of utterances, and the transcriptions.
Address	Atlanta, GA, US
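
The selection and retraining steps recited in the main claim can be illustrated with a minimal Python sketch. It is not the patented implementation. Everything in it is an assumption made for illustration: the "geometrically computed n-tuple confidence score" is read here as the geometric mean of per-word recognizer confidences, the least confident utterances are the ones sent for manual transcription, and the "language model" is reduced to raw bigram counts. The function names, the data layout, and the sample utterances are all hypothetical.

```python
import math
from collections import Counter


def utterance_confidence(word_confidences):
    """Geometric mean of per-word ASR confidences.

    Assumption: this stands in for the claim's "geometrically computed"
    score; the record does not give the actual formula.
    """
    if not word_confidences:
        return 0.0
    # Work in log space; clamp to avoid log(0).
    log_sum = sum(math.log(max(c, 1e-12)) for c in word_confidences)
    return math.exp(log_sum / len(word_confidences))


def select_for_transcription(hypotheses, k):
    """Return the ids of the k utterances with the lowest confidence."""
    ranked = sorted(
        hypotheses,
        key=lambda utt_id: utterance_confidence(hypotheses[utt_id]["conf"]),
    )
    return ranked[:k]


def bigram_counts(transcripts):
    """Toy stand-in for language model training: count word bigrams."""
    counts = Counter()
    for words in transcripts:
        padded = ["<s>"] + list(words) + ["</s>"]
        counts.update(zip(padded, padded[1:]))
    return counts


# Hypothetical ASR output from a bootstrap model: words plus confidences.
hypotheses = {
    "u1": {"words": ["pay", "my", "bill"], "conf": [0.9, 0.9, 0.9]},
    "u2": {"words": ["check", "balance"], "conf": [0.4, 0.1]},
    "u3": {"words": ["cancel", "service"], "conf": [0.8, 0.5]},
}

# Pick a predetermined number (here 1) of utterances for a human to transcribe.
selected = select_for_transcription(hypotheses, k=1)

# Stand-in for the human transcription received for the selected utterance.
manual = {"u2": ["change", "my", "plan"]}

# Train on manual transcripts where available, ASR output otherwise.
training = [manual.get(utt_id, hypotheses[utt_id]["words"]) for utt_id in hypotheses]
model = bigram_counts(training)
```

In this sketch the manual transcript replaces the recognizer output for the selected utterance, so the low-confidence hypothesis never reaches the training data. A production system would replace the bigram counter with a smoothed n-gram or neural language model.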