Title of invention SYSTEMS AND METHODS FOR TEXTUAL CONTENT CREATION FROM SOURCES OF AUDIO THAT CONTAIN SPEECH
Abstract A system and method of creating textual content from audio streams is presented. The system can include a computing device configured to receive audio streams containing speech and identify the different speakers in the speech. The system breaks apart an audio stream into separate audio streams using speaker diarization, and each audio stream is sent separately to a speech-to-text transcriber. Each audio stream includes only the speech of a single speaker, which is more easily converted into text by the speech-to-text transcriber. The text streams can be assembled into a transcript of the speech portions of the audio stream. A web page of the transcript can be published. High-frequency words in the transcript can be tagged in the metadata of the web page to assist search engines and increase the value of the web page.
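The metadata tagging step mentioned at the end of the abstract could be sketched as follows. This is only an illustration under stated assumptions: the record does not say how high-frequency words are selected or which metadata field is used, so the stop-word list, the function names `top_keywords` and `keywords_meta_tag`, and the choice of an HTML `keywords` meta tag are all hypothetical.

```python
import re
from collections import Counter
from html import escape

# Hypothetical stop-word list (an assumption, not from the patent record);
# a real system would use a much fuller list.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "that", "we", "you", "i"}


def top_keywords(transcript, n=5):
    """Return the n most frequent non-stop-words in the transcript."""
    words = re.findall(r"[a-z']+", transcript.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(n)]


def keywords_meta_tag(transcript, n=5):
    """Build an HTML keywords meta tag from the most frequent words."""
    content = ", ".join(top_keywords(transcript, n))
    return f'<meta name="keywords" content="{escape(content, quote=True)}">'
```

Calling `keywords_meta_tag(transcript_text)` would give a tag that could be placed in the `<head>` of the published transcript page.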
Publication number US2016179831(A1) Publication date 2016.06.23
Application number US201414891221 Filing date 2014.07.14
Applicant VOCAVU SOLUTIONS LTD. Inventors Gruber Zeev;Turner Ziv;Atias Nissim;Polityko Eduard
Classification G06F17/30;G06F17/28;G10L15/26;G10L19/018;G10L17/00 Main classification G06F17/30
Agency Agent
Principal claim 1. A textual content creation system, comprising: a computing device configured to receive an audio stream that includes speech from at least a first speaker and a second speaker, identify a first portion of the audio stream having speech from the first speaker as a first portion of the audio stream with speaker diarization, and a second portion of the audio stream having speech from the second speaker as a second portion of the audio stream with speaker diarization, send each of the portions of the audio stream with speaker diarization to a speech-to-text transcriber separate from the other portion of the audio stream with speaker diarization, each portion of the audio stream with speaker diarization consisting essentially of a portion of the audio stream that includes speech identified with exactly one of the first speaker or the second speaker, receive one or more text streams, each text stream consisting essentially of a transcribed text of the speech of an associated portion of the audio stream with speaker diarization, and assemble, from one or more transcribed texts, an ordered transcript of the speech of the audio stream.
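The send, receive, and assemble steps of the claim can be illustrated with a minimal sketch. It assumes diarization has already produced single-speaker portions with start times. The `Segment` type, the `assemble_transcript` function, and the `transcribe` callable are hypothetical names introduced for illustration; the record does not specify any particular diarization or speech-to-text interface.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Segment:
    """One single-speaker portion of the audio stream (hypothetical type)."""
    speaker: str   # label assigned by an upstream diarization step
    start: float   # start time in seconds within the original stream
    audio: bytes   # audio data containing only this speaker's speech


def assemble_transcript(segments: Iterable[Segment],
                        transcribe: Callable[[bytes], str]) -> str:
    """Transcribe each portion separately and merge the texts in time order."""
    lines = []
    for seg in sorted(segments, key=lambda s: s.start):
        # Each portion goes to the transcriber on its own, never mixed
        # with another speaker's audio.
        text = transcribe(seg.audio).strip()
        if text:
            lines.append(f"{seg.speaker}: {text}")
    return "\n".join(lines)
```

Passing the transcriber in as a plain callable keeps the sketch independent of any particular speech-to-text service; ordering by start time is one straightforward way to obtain the ordered transcript the claim describes.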
Address Ramat Gan IL