Abstract |
PURPOSE: To improve time resolution by extracting an acoustic parameter from the input speech near a first phoneme boundary candidate and near each phoneme boundary sub-candidate, comparing the resulting parameter values with one another, and selecting the phoneme boundary from among them. CONSTITUTION: The first phoneme boundary candidate 21 is extracted from acoustic parameter values computed for each frame of a fixed length. In the vicinity of the candidate 21, the peaks of the input sound waveform are detected, and only those peaks that correspond to the periodic vibration of the vocal cords are kept as phoneme boundary sub-candidates. The actual phoneme boundary 23 is assumed to lie within the frames immediately before and after the candidate 21, so the search for sub-candidates can be restricted to those frames. Because the boundary can be determined pitch-synchronously, accurate segmentation with high time resolution is achieved. |
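
The two-stage procedure described in the abstract can be sketched roughly as follows. This is an illustrative sketch, not the patented implementation. The abstract does not name the acoustic parameter, so short-time energy is used as a stand-in. The pitch period is assumed to be known in advance. All function names (`synth`, `first_candidate`, `periodic_peaks`, `refine_boundary`) and the tolerance `tol` are hypothetical.

```python
def synth(n=800, period=40, change_at=360):
    """Toy voiced signal (assumption): decaying pulses, one per pitch period,
    whose amplitude steps from 0.2 to 1.0 at sample `change_at`."""
    x = []
    for i in range(n):
        pulse_start = i - i % period
        amp = 1.0 if pulse_start >= change_at else 0.2
        x.append(amp * 0.8 ** (i % period))
    return x


def mean_square(seg):
    """Short-time energy, used here as the stand-in acoustic parameter."""
    return sum(v * v for v in seg) / len(seg) if seg else 0.0


def first_candidate(x, frame_len):
    """Stage 1: frame-based candidate, placed at the start of the frame
    that follows the largest energy change between adjacent frames."""
    n_frames = len(x) // frame_len
    e = [mean_square(x[k * frame_len:(k + 1) * frame_len])
         for k in range(n_frames)]
    jumps = [abs(e[k + 1] - e[k]) for k in range(n_frames - 1)]
    k_best = max(range(len(jumps)), key=jumps.__getitem__)
    return (k_best + 1) * frame_len


def periodic_peaks(x, lo, hi, period, tol=0.2):
    """Stage 2a: waveform peaks in [lo, hi) that look pitch-periodic.
    A peak is kept if it dominates its +/- half-period neighbourhood and
    lies about one period away from a neighbouring dominant peak."""
    half = period // 2
    lo, hi = max(lo, half), min(hi, len(x) - half)
    dominant = [i for i in range(lo, hi)
                if x[i] > 0 and x[i] >= max(x[i - half:i + half + 1])]
    keep = []
    for k, p in enumerate(dominant):
        neighbours = dominant[max(k - 1, 0):k + 2]
        gaps = [abs(p - q) for q in neighbours if q != p]
        if any(abs(g - period) <= tol * period for g in gaps):
            keep.append(p)
    return keep


def refine_boundary(x, frame_len, period):
    """Stage 2b: compare the parameter just before and just after each
    sub-candidate and pick the one with the largest change."""
    cand = first_candidate(x, frame_len)
    # Search only the frames immediately before and after the candidate.
    subs = periodic_peaks(x, cand - frame_len, cand + frame_len, period)

    def score(c):
        before = mean_square(x[max(c - period, 0):c])
        after = mean_square(x[c:c + period])
        return abs(after - before)

    return max(subs, key=score) if subs else cand


if __name__ == "__main__":
    x = synth()
    print(first_candidate(x, 100))      # coarse, frame-synchronous estimate
    print(refine_boundary(x, 100, 40))  # pitch-synchronous estimate
```

On this toy signal the frame-based stage places the candidate at sample 400, while the pitch-synchronous refinement selects sample 360, where the amplitude actually changes. The refined boundary is no longer tied to the 100-sample frame grid, which is the gain in time resolution the abstract claims.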