Title of invention: Methods and apparatus for performing speech recognition using acoustic models which are improved through an interactive process
Abstract: Automated methods and apparatus are described for synchronizing audio and text data, e.g., in the form of electronic files, that represent audio and text expressions of the same work or information. Also described are automated methods of detecting errors and other discrepancies between the audio and text versions of the same work. A speech recognition operation is first performed on the audio data using a speaker-independent acoustic model. The operation produces recognized text together with audio time stamps. The recognized text is compared to the text in the text data to identify correctly recognized words. The acoustic model is then retrained using the correctly recognized text and the corresponding audio segments from the audio data, transforming the initial acoustic model into a speaker-trained acoustic model. The retrained acoustic model is then used to perform an additional speech recognition operation on the audio data. The audio and text data are synchronized using the results obtained with the updated acoustic model. In addition, one or more error reports based on the final recognition results are generated, showing discrepancies between the recognized words and the words included in the text. Retraining the acoustic model in this manner improves recognition accuracy.
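The comparison step in the abstract can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented method: the record does not say how recognized words are aligned with the text, so the standard-library `difflib.SequenceMatcher` is swapped in as a generic sequence aligner. The names `RecognizedWord` and `compare_to_text` are hypothetical, and the recognizer and the acoustic-model retraining are outside the sketch. Matching words, with their time stamps, stand in for the "correctly recognized words" that would serve as retraining material and synchronization points; everything else goes into a discrepancy list of the kind an error report could be built from.

```python
from difflib import SequenceMatcher
from typing import List, NamedTuple, Tuple


class RecognizedWord(NamedTuple):
    """One recognizer output token (hypothetical structure)."""
    word: str
    start: float  # seconds from the beginning of the audio
    end: float


def compare_to_text(
    recognized: List[RecognizedWord], text_words: List[str]
) -> Tuple[List[Tuple[int, RecognizedWord]], List[Tuple[List[str], List[str]]]]:
    """Align recognizer output with the reference text.

    Returns (matched, discrepancies):
      matched       - (index into text_words, RecognizedWord) for each word
                      that agrees with the text; these carry time stamps.
      discrepancies - (recognized words, text words) for each span where
                      the two versions disagree.
    """
    rec = [r.word.lower() for r in recognized]
    ref = [w.lower() for w in text_words]
    # Assumption: a generic diff-style alignment is good enough here.
    matcher = SequenceMatcher(a=rec, b=ref, autojunk=False)

    matched = []
    discrepancies = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            for k in range(i2 - i1):
                matched.append((j1 + k, recognized[i1 + k]))
        else:
            # 'replace', 'delete' or 'insert': the versions differ here.
            discrepancies.append((rec[i1:i2], ref[j1:j2]))
    return matched, discrepancies
```

In a full pipeline along the lines the abstract describes, the matched words and their audio segments would be passed to a retraining step, recognition would be run again with the retrained model, and this comparison would be repeated on the new output to produce the final synchronization and error report.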
Publication No.: US6263308(B1) Publication date: 2001.07.17
Application No.: US20000531055 Filing date: 2000.03.20
Applicant: MICROSOFT CORPORATION Inventors: HECKERMAN DAVID E.;ALLEVA FILENO A.;ROUNTHWAITE ROBERT L.;ROSEN DANIEL;HWANG MEI-YUH;YAACOVI YORAM;MANFERDELLI JOHN L.
Classification: G10L15/06;(IPC1-7):G10L15/02;G10L11/06 Main classification: G10L15/06
Agency: Agent:
Main claim:
Address: