Abstract |
A voice analysis method and device are provided that can process speech in real time and can handle an indefinitely large number of speakers. The computer-executable voice analysis method detects phoneme boundaries in input voice and is characterized by: repeating a step of specifying a time-point in the input voice signal, a step of extracting the voice signal contained in a time range of prescribed length starting from this time-point, and a step of decomposing the extracted voice signal into frequency component data, thereby obtaining a plurality of frequency component data, one for each time range of the prescribed length; finding a plurality of correlations using the frequency component data of mutually adjacent time ranges; finding degrees of change from these correlations; identifying time ranges whose degree of change is larger than the two adjacent degrees of change on either side; and partitioning the input voice signal into a plurality of sections based on these time ranges.
|
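
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation, and several details are assumptions that the abstract does not specify: the function name `segment_voice`, the parameters `frame_len` and `hop`, the Hann window, the use of FFT magnitudes as the frequency component data, the normalized inner product as the correlation, and `1 - correlation` as the degree of change.

```python
import numpy as np

def segment_voice(signal, frame_len=256, hop=64):
    """Illustrative sketch: split a signal at local peaks of spectral change.

    frame_len and hop are hypothetical parameters; the abstract only speaks
    of time ranges of a "prescribed length".
    """
    signal = np.asarray(signal, dtype=float)
    # Specify time-points: one frame start every `hop` samples.
    starts = np.arange(0, len(signal) - frame_len + 1, hop)
    if len(starts) < 4:
        raise ValueError("signal too short for the chosen frame_len and hop")

    # Extract each fixed-length range and decompose it into frequency
    # components (assumption: Hann-windowed FFT magnitude spectrum).
    window = np.hanning(frame_len)
    spectra = np.array(
        [np.abs(np.fft.rfft(signal[s:s + frame_len] * window)) for s in starts]
    )

    # Correlation between mutually adjacent ranges
    # (assumption: normalized inner product of the two spectra).
    a, b = spectra[:-1], spectra[1:]
    norms = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    corr = np.sum(a * b, axis=1) / np.maximum(norms, 1e-12)

    # Degree of change (assumption: one minus the correlation).
    change = 1.0 - corr

    # Keep positions whose degree of change exceeds both neighbours.
    mid = change[1:-1]
    peaks = np.flatnonzero((mid > change[:-2]) & (mid > change[2:])) + 1

    # Map each peak to a sample index halfway between the two frame centres.
    boundaries = starts[peaks] + (frame_len + hop) // 2

    # Partition the signal into sections delimited by the boundaries.
    edges = np.concatenate(([0], boundaries, [len(signal)]))
    sections = list(zip(edges[:-1].tolist(), edges[1:].tolist()))
    return boundaries, change[peaks], sections
```

Because the only criterion is a local maximum, the sketch also reports tiny peaks inside steady regions; a practical system would likely filter them, for example by the returned peak strength. Nothing in the sketch depends on speaker-specific templates, which is consistent with the abstract's claim of coping with an indefinitely large number of speakers.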