Abstract |
A system for browsing and editing video, in accordance with the present invention, includes a video source for providing a video document which includes audio information, and an audio classifier coupled to the video source, the audio classifier being adapted to classify audio segments of the audio information into a plurality of classes. An audio spectrogram generator is coupled to the video source for generating spectrograms of the audio information to check that the audio segments have been identified correctly by the audio classifier. A browser is coupled to the audio classifier for searching the classified audio segments for editing and browsing the video document. A method for editing and browsing a video, in accordance with the invention, includes providing a video clip including audio, and segmenting and labeling the audio into music, silence and speech classes in real time. The method also includes determining pitch for the speech class to identify and check changes in speaker, and browsing the changes in speaker and the audio labels to associate them with frames of the video clip.
|
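The pitch-based speaker-change step and the association with video frames might be sketched as follows. This is only an illustrative sketch, not the method claimed in the document: the abstract does not say how pitch is determined or how a change in speaker is decided, so the autocorrelation pitch estimator, the relative-difference threshold, the frame rate, and all function and parameter names (`estimate_pitch`, `speaker_changes`, `time_to_video_frame`, `rel_threshold`, `fps`) are assumptions introduced here.

```python
import math


def estimate_pitch(frame, sample_rate, fmin=60.0, fmax=400.0):
    """Estimate pitch in Hz from the strongest autocorrelation peak.

    Assumption: the search range [fmin, fmax] roughly covers speech pitch.
    Returns 0.0 when no positive autocorrelation peak is found.
    """
    lag_min = int(sample_rate / fmax)
    lag_max = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_corr = 0, 0.0
    for lag in range(lag_min, lag_max + 1):
        corr = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    return sample_rate / best_lag if best_lag else 0.0


def speaker_changes(segment_pitches, rel_threshold=0.2):
    """Return indices of speech segments whose pitch differs from the
    previous segment by more than rel_threshold (hypothetical criterion)."""
    changes = []
    for i in range(1, len(segment_pitches)):
        prev, cur = segment_pitches[i - 1], segment_pitches[i]
        if prev > 0 and abs(cur - prev) / prev > rel_threshold:
            changes.append(i)
    return changes


def time_to_video_frame(seconds, fps=30.0):
    """Map an audio timestamp to a video frame index (fps is an assumption)."""
    return int(seconds * fps)


if __name__ == "__main__":
    # Synthetic 200 Hz tone standing in for a voiced speech frame.
    rate = 8000
    tone = [math.sin(2 * math.pi * 200 * i / rate) for i in range(400)]
    print(estimate_pitch(tone, rate))
    print(speaker_changes([120.0, 125.0, 210.0, 205.0]))
    print(time_to_video_frame(2.5))
```

In a fuller pipeline, each speech segment produced by the audio classifier would supply one pitch value, and each detected change would be mapped to a video frame so the browser can jump to it.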