Title of invention: Methods and apparatus for automatically synchronizing electronic audio files with electronic text files
Abstract: Automated methods and apparatus are described for synchronizing audio and text data, e.g., in the form of electronic files, that represent audio and text expressions of the same work or information. A statistical language model is generated from the text data. A speech recognition operation is then performed on the audio data using the generated language model and a speaker-independent acoustic model. Silence is modeled as a word that can be recognized. The speech recognition operation produces a time-indexed set of recognized words, some of which may be silence. The recognized words are globally aligned with the words in the text data. Recognized periods of silence that correspond to expected periods of silence and are adjoined by one or more correctly recognized words are identified as points where the text and audio files should be synchronized, e.g., by the insertion of bi-directional pointers. In one embodiment, for a text location to be identified for synchronization purposes, both words that bracket the recognized silence, e.g., the words that precede and follow it, must be correctly identified. Pointers corresponding to the identified locations of silence are inserted into the text and/or audio files at those locations. Audio time stamps obtained from the speech recognition operation may be used as the bi-directional pointers. Synchronized text and audio data may be output in a variety of file formats.
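The selection rule in the abstract (keep a recognized silence only when it matches an expected silence and both bracketing words were recognized correctly) can be sketched roughly as follows. This is an illustrative sketch, not the patented implementation: the `<sil>` token, the `find_sync_points` function, and the sample data are hypothetical, expected silences are assumed to have already been marked in the text tokens, and Python's `difflib.SequenceMatcher` stands in for the global alignment step.

```python
from difflib import SequenceMatcher

SIL = "<sil>"  # hypothetical token used here to represent silence as a "word"


def find_sync_points(text_tokens, recognized):
    """Return (text_index, audio_time) pairs usable as synchronization points.

    text_tokens: words of the text, with SIL wherever silence is expected.
    recognized:  (token, start_time_in_seconds) pairs from the recognizer.
    """
    rec_tokens = [tok for tok, _ in recognized]
    # Stand-in for global alignment: exact-match blocks between the sequences.
    matcher = SequenceMatcher(a=text_tokens, b=rec_tokens, autojunk=False)
    aligned = {}  # recognized index -> text index, for matching tokens only
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            aligned[block.b + k] = block.a + k

    points = []
    for j, (tok, start) in enumerate(recognized):
        if tok != SIL or j not in aligned:
            continue  # not silence, or silence that was not expected here
        i = aligned[j]
        # Both bracketing tokens must align to the adjacent text positions...
        if aligned.get(j - 1) != i - 1 or aligned.get(j + 1) != i + 1:
            continue
        # ...and must be real words rather than further silence.
        if text_tokens[i - 1] == SIL or text_tokens[i + 1] == SIL:
            continue
        points.append((i, start))
    return points


text = ["the", "cat", "sat", SIL, "on", "the", "mat", SIL, "it", "slept"]
recognized = [
    ("the", 0.0), ("cat", 0.3), ("sat", 0.6), (SIL, 0.9), ("on", 1.4),
    ("the", 1.6), ("hat", 1.8), (SIL, 2.1), ("it", 2.6), ("slept", 2.8),
]
print(find_sync_points(text, recognized))  # [(3, 0.9)]
```

In this made-up example the second silence is rejected because the word before it ("mat") was misrecognized as "hat", so only the first silence yields a pointer; its audio time stamp (0.9 s) plays the role of the bi-directional pointer.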
Publication number: US6260011(B1)  Publication date: 2001.07.10
Application number: US20000531054  Filing date: 2000.03.20
Applicant: MICROSOFT CORPORATION  Inventors: HECKERMAN DAVID E.; ALLEVA FILENO A.; ROUNTHWAITE ROBERT L.; ROSEN DANIEL; HWANG MEI-YUH; YAACOVI YORAM; MANFERDELLI JOHN L.
Classification: G06F17/30; G09B5/06; G10L15/08; G10L15/26; (IPC1-7): G10L15/08; G10L11/06  Main classification: G06F17/30
Agency:  Agent:
Main claim:
Address: