Abstract |
PROBLEM TO BE SOLVED: To segment a voice signal containing the voices of a plurality of speakers into sections for each speaker. SOLUTION: A voice segmentation section 12 specifies an envelope E of the waveform of a voice signal S containing the voices of a plurality of speakers, and detects a plurality of valleys D in the envelope E. Each valley D is the boundary between a section in which the level of the envelope E decreases continuously for a prescribed time period and a section in which the level of the envelope E increases continuously for a prescribed time period. The voice segmentation section 12 segments the voice signal S into a plurality of sections B, using each valley D as a boundary. Moreover, the voice segmentation section 12 specifies a peak value Lp for each of a plurality of peaks P of the envelope E, and determines that, among the plurality of sections B, any section B containing a peak P whose peak value Lp is lower than the threshold TH is a silent section. COPYRIGHT: (C)2009,JPO&INPIT
|
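The valley detection and silent-section test described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function names `find_valleys` and `label_sections` and the parameter `run` are hypothetical, the envelope E is assumed to be already available as a list of sampled levels (the abstract does not say how E is computed), the "prescribed time period" is modelled as a number of samples, and the same period is assumed for the decreasing and the increasing side.

```python
def find_valleys(env, run):
    """Return indices of valleys D in the envelope samples `env`.

    Assumption: a valley is a sample preceded by `run` strictly
    decreasing steps and followed by `run` strictly increasing steps.
    """
    valleys = []
    for i in range(run, len(env) - run):
        falling = all(env[k] > env[k + 1] for k in range(i - run, i))
        rising = all(env[k] < env[k + 1] for k in range(i, i + run))
        if falling and rising:
            valleys.append(i)
    return valleys


def label_sections(env, valleys, threshold):
    """Split `env` at the valleys into sections B and flag silent ones.

    Returns (start, end, is_silent) tuples with `end` exclusive. A section
    is flagged silent when its peak value Lp is below `threshold` (TH).
    """
    bounds = [0] + list(valleys) + [len(env)]
    sections = []
    for start, end in zip(bounds, bounds[1:]):
        peak = max(env[start:end])  # peak value Lp of this section
        sections.append((start, end, peak < threshold))
    return sections


if __name__ == "__main__":
    # Toy envelope with three humps; the middle one is quiet.
    env = [1, 3, 5, 3, 1, 2, 3, 2, 1, 0, 4, 8, 4]
    valleys = find_valleys(env, run=2)
    print(valleys)                          # [4, 9]
    print(label_sections(env, valleys, 4))  # [(0, 4, False), (4, 9, True), (9, 13, False)]
```

In this toy input the middle section peaks at 3, below the assumed threshold of 4, so it alone is flagged as silent. Real envelopes are noisy, so a practical version would probably smooth E or tolerate small reversals rather than require strict monotonicity.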