Abstract |
<P>PROBLEM TO BE SOLVED: To provide a spoken word analyzer that extracts prominence sections from voice data without using learning data. <P>SOLUTION: The spoken word analyzer takes as input a target voice for prominence section extraction, to which a language label has been applied, and synthesizes a read-aloud-style synthesized voice having the same language label. Taking the target voice and the synthesized voice as inputs, the analyzer then extracts a fundamental frequency sequence X1 from the target voice and a fundamental frequency sequence X2 from the synthesized voice. Taking X1, X2, and the language label as inputs, a prominence section extracting unit in the analyzer extracts the prominence sections of the target voice and outputs prominence section information on the basis of (i) the correlation between X1 and X2 in the direction of fundamental frequency variation between accent phrases and (ii) a comparison between X1 and X2 in the quantity of fundamental frequency variation between accent phrases. <P>COPYRIGHT: (C)2013,JPO&INPIT |
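
The comparison of X1 and X2 described in the abstract can be illustrated with a minimal sketch. It is not the claimed implementation. The following are assumptions made only for illustration: the accent phrase boundaries are taken from the language label as `(start, end)` frame ranges, each accent phrase is summarized by its mean log fundamental frequency, and the names `phrase_means`, `find_prominence`, `ratio`, and `min_rise` are hypothetical. The abstract does not state the actual decision rule or any threshold values.

```python
import math
from statistics import mean


def phrase_means(f0, phrases):
    """Mean log-F0 of each accent phrase.

    f0      -- fundamental frequency per frame in Hz (0 marks an unvoiced frame)
    phrases -- list of (start, end) frame ranges, end exclusive (assumed to
               come from the language label)
    """
    means = []
    for start, end in phrases:
        voiced = [math.log(v) for v in f0[start:end] if v > 0]
        if not voiced:
            raise ValueError("accent phrase has no voiced frames")
        means.append(mean(voiced))
    return means


def find_prominence(x1, phrases1, x2, phrases2, ratio=1.5, min_rise=0.05):
    """Return indices of accent phrases flagged as prominent (hypothetical rule).

    x1 / phrases1 -- F0 sequence and accent phrase ranges of the input voice
    x2 / phrases2 -- the same for the read-aloud-style synthesized voice
    ratio, min_rise -- illustrative thresholds, not taken from the source
    """
    m1 = phrase_means(x1, phrases1)
    m2 = phrase_means(x2, phrases2)
    if len(m1) != len(m2):
        raise ValueError("both voices must share the same accent phrases")

    prominent = []
    for i in range(1, len(m1)):
        d1 = m1[i] - m1[i - 1]  # F0 change between accent phrases, input voice
        d2 = m2[i] - m2[i - 1]  # F0 change between accent phrases, synthesized voice
        if d1 < min_rise:
            continue  # the input voice does not rise noticeably here
        same_direction = d1 * d2 > 0
        # Flag the phrase when the directions disagree, or when the input
        # voice rises much more than the synthesized reference does.
        if not same_direction or d1 > ratio * d2:
            prominent.append(i)
    return prominent


# Toy data: three accent phrases, three frames each, one unvoiced frame.
x1 = [120, 120, 0, 180, 180, 180, 110, 110, 110]  # input voice
x2 = [120, 120, 120, 125, 125, 125, 110, 110, 110]  # synthesized voice
ranges = [(0, 3), (3, 6), (6, 9)]
print(find_prominence(x1, ranges, x2, ranges))
```

In this toy data the input voice rises far more steeply into the second accent phrase than the synthesized reference does, so only that phrase is flagged. Because the reference is synthesized from the same language label, no learning data about prominence is needed for the comparison.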