Principal claim |
1. A textual content creation system, comprising:
a computing device configured to
receive an audio stream that includes speech from at least a first speaker and a second speaker,
identify a first portion of the audio stream having speech from the first speaker as a first portion of the audio stream with speaker diarization, and a second portion of the audio stream having speech from the second speaker as a second portion of the audio stream with speaker diarization,
send each of the portions of the audio stream with speaker diarization to a speech-to-text transcriber separate from the other portion of the audio stream with speaker diarization, each portion of the audio stream with speaker diarization consisting essentially of a portion of the audio stream that includes speech identified with exactly one of the first speaker or the second speaker,
receive one or more text streams, each text stream consisting essentially of a transcribed text of the speech of an associated portion of the audio stream with speaker diarization, and
assemble, from one or more transcribed texts, an ordered transcript of the speech of the audio stream. |
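The data flow recited in claim 1 can be illustrated with a minimal Python sketch. It is not the claimed implementation, and everything named here is hypothetical: `DiarizedPortion`, `TranscribedText`, `transcribe_separately`, `assemble_transcript`, and the stand-in `fake_transcribe` do not come from the claim. The sketch assumes the diarization step has already produced single-speaker portions, and it assumes that "ordered" means ordered by portion start time, which the claim does not specify.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class DiarizedPortion:
    """One portion of the audio stream attributed to exactly one speaker (hypothetical type)."""
    speaker: str
    start: float  # seconds from the beginning of the stream (assumed ordering key)
    end: float
    audio: bytes


@dataclass(frozen=True)
class TranscribedText:
    """Text returned by the transcriber for one diarized portion (hypothetical type)."""
    speaker: str
    start: float
    text: str


def transcribe_separately(
    portions: Sequence[DiarizedPortion],
    transcribe: Callable[[bytes], str],
) -> List[TranscribedText]:
    """Send each portion to the transcriber on its own, never mixed with another portion."""
    return [
        TranscribedText(p.speaker, p.start, transcribe(p.audio))
        for p in portions
    ]


def assemble_transcript(texts: Sequence[TranscribedText]) -> List[str]:
    """Assemble an ordered transcript; ordering by start time is an assumption."""
    ordered = sorted(texts, key=lambda t: t.start)
    return [f"{t.speaker}: {t.text}" for t in ordered]


def fake_transcribe(audio: bytes) -> str:
    """Stand-in for a real speech-to-text service: decodes the bytes as UTF-8 text."""
    return audio.decode("utf-8")


# Portions arrive out of order to show that assembly restores the order.
portions = [
    DiarizedPortion("Speaker 2", 2.0, 4.0, b"fine, thanks"),
    DiarizedPortion("Speaker 1", 0.0, 2.0, b"how are you"),
]
transcript = assemble_transcript(transcribe_separately(portions, fake_transcribe))
```

Passing the transcriber in as a callable keeps the sketch independent of any particular speech-to-text service; a real system would replace `fake_transcribe` with a call to that service.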