Abstract |
Provided is a technology for generating rules that allow highly natural speech synthesis without unnecessarily collecting a large amount of learning data. This speech-synthesis system includes: a learning database that stores learning data, said learning data being a set of feature quantities extracted from speech waveform data; a feature-quantity-space partitioning means that partitions a feature-quantity space, which is a space related to the learning data, into subspaces; a density-detection means that detects the density of each of the subspaces into which the feature-quantity space has been partitioned, and generates and outputs density information indicating said densities; and a rule-generation means that, on the basis of the outputted density information, generates speech-synthesis rules for generating pronunciation information used in speech synthesis. |
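The pipeline described in the abstract (partition the feature-quantity space, detect the density of each subspace, generate rules from the density information) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the source: the uniform grid partitioning, the use of a raw sample count as "density", the `min_count` threshold, and the function names `partition`, `detect_density`, and `generate_rules`. The "rule" produced is only a dense/sparse label per subspace, standing in for whatever speech-synthesis rules the actual system generates.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple

Cell = Tuple[int, ...]


def partition(features: Sequence[Sequence[float]], cell_size: float) -> List[Cell]:
    """Assign each feature vector to a grid cell (assumed uniform partitioning)."""
    return [tuple(int(x // cell_size) for x in vec) for vec in features]


def detect_density(cells: Sequence[Cell]) -> Dict[Cell, int]:
    """Density information: number of learning samples per subspace (assumed measure)."""
    return dict(Counter(cells))


def generate_rules(density: Dict[Cell, int], min_count: int) -> Dict[Cell, str]:
    """Label each occupied subspace as 'dense' or 'sparse' (hypothetical rule form)."""
    return {
        cell: ("dense" if count >= min_count else "sparse")
        for cell, count in density.items()
    }


# Hypothetical two-dimensional feature quantities, for illustration only.
learning_data = [(0.1, 0.2), (0.3, 0.4), (0.2, 0.9), (2.5, 2.5)]
cells = partition(learning_data, cell_size=1.0)
density = detect_density(cells)
rules = generate_rules(density, min_count=2)
```

In a sketch like this, a synthesizer could prefer targets that fall in subspaces labeled dense, where the learning data supports them, which is one plausible way density information might reduce the need for additional data collection.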