Abstract |
A video processing method for detecting significant events in a video program includes computing short-time sub-band energies in the audio for plural audio sub-bands, detecting scene boundaries where a weighted sum of these short-time sub-band energies is less than an energy threshold for longer than a time interval, segmenting the video program into a plurality of scenes at the boundaries, removing scenes shorter than a segment time interval, and classifying and ranking the remaining scenes by audio. A second segmenting and removal is based upon a second energy threshold and a second time interval, or upon energy in the lowest frequency sub-band being greater than a predetermined bass energy threshold. The first segment time interval may be recomputed based upon the distribution of lengths of the remaining scenes. |
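The first segmentation pass described above might be sketched as follows. This is only an illustration under stated assumptions: the function names, the per-frame list layout, and the use of frame counts in place of time intervals are hypothetical and do not come from the source. The second pass, the bass energy test, and the classification and ranking steps are omitted.

```python
from typing import List, Sequence, Tuple


def weighted_energy(subband_energies: Sequence[Sequence[float]],
                    weights: Sequence[float]) -> List[float]:
    """Weighted sum of the short-time sub-band energies for each frame."""
    return [sum(w * e for w, e in zip(weights, frame))
            for frame in subband_energies]


def find_boundaries(energy: Sequence[float],
                    energy_threshold: float,
                    min_quiet_frames: int) -> List[Tuple[int, int]]:
    """Return [start, end) frame ranges where the energy stays below the
    threshold for longer than min_quiet_frames (assumed frame units)."""
    boundaries = []
    run_start = None
    # The infinite sentinel closes a quiet run that reaches the last frame.
    for i, e in enumerate(list(energy) + [float("inf")]):
        if e < energy_threshold:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if i - run_start > min_quiet_frames:
                boundaries.append((run_start, i))
            run_start = None
    return boundaries


def segment_scenes(num_frames: int,
                   boundaries: Sequence[Tuple[int, int]],
                   min_scene_frames: int) -> List[Tuple[int, int]]:
    """Split [0, num_frames) at the boundaries, then remove scenes
    shorter than min_scene_frames."""
    scenes = []
    cursor = 0
    for start, end in boundaries:
        if start > cursor:
            scenes.append((cursor, start))
        cursor = end
    if cursor < num_frames:
        scenes.append((cursor, num_frames))
    return [(s, e) for s, e in scenes if e - s >= min_scene_frames]


# Toy input: two sub-bands per frame, loud and quiet stretches alternating.
loud, quiet = [4.0, 2.0], [0.0, 0.0]
frames = [loud] * 5 + [quiet] * 4 + [loud] * 2 + [quiet] * 4 + [loud] * 6
energy = weighted_energy(frames, weights=[1.0, 0.5])
boundaries = find_boundaries(energy, energy_threshold=1.0, min_quiet_frames=3)
scenes = segment_scenes(len(frames), boundaries, min_scene_frames=3)
```

In this toy input the two-frame loud stretch between the quiet runs is segmented out as its own scene and then removed for being shorter than the assumed minimum scene length.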