Abstract |
The invention uses a database of diphones derived from natural speech. A text is rendered as a series of target diphones, and for each of these a number of predetermined diphone features are identified. Potential matches are identified in the database, and a target cost is established for each of these features. The target costs are modified before a least-cost combination is selected. The target costs may be modified by weighting or by the use of distribution functions. The least-cost combination may be calculated by a dynamic search program such as a Viterbi search. In the preferred embodiments, diphone join costs are also included in the least-cost calculation, and are likewise modified before the calculation is made. In addition to, or instead of, modifying the target costs, the potential matches may be pre-pruned to a predetermined number of potential matches, ranked in descending order of suitability.
|
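The selection procedure summarised in the abstract can be illustrated with a minimal sketch. It is not taken from the patent: it assumes numeric features, a target cost formed as a weighted sum of absolute feature differences, and a caller-supplied join cost function. The names `weighted_target_cost`, `pre_prune`, `viterbi_select`, `join_cost` and `join_weight` are hypothetical, and the distribution-function variant of cost modification is not shown.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Assumption: each diphone is described by a dict of numeric features.
Features = Dict[str, float]


def weighted_target_cost(target: Features, cand: Features,
                         weights: Dict[str, float]) -> float:
    """Weighted sum of per-feature mismatches (one assumed form of 'modification')."""
    return sum(w * abs(target[name] - cand[name]) for name, w in weights.items())


def pre_prune(target: Features, cands: Sequence[Features],
              weights: Dict[str, float], keep: int) -> List[Features]:
    """Keep the `keep` most suitable candidates, best (lowest cost) first."""
    ranked = sorted(cands, key=lambda c: weighted_target_cost(target, c, weights))
    return ranked[:keep]


def viterbi_select(targets: Sequence[Features],
                   candidates: Sequence[Sequence[Features]],
                   weights: Dict[str, float],
                   join_cost: Callable[[Features, Features], float],
                   join_weight: float = 1.0) -> Tuple[List[int], float]:
    """Pick one candidate index per target, minimising target plus join costs."""
    # best[i][j]: cheapest total cost of any path ending at candidate j of target i
    best = [[weighted_target_cost(targets[0], c, weights) for c in candidates[0]]]
    back = [[-1] * len(candidates[0])]
    for i in range(1, len(targets)):
        row, ptr = [], []
        for cand in candidates[i]:
            tc = weighted_target_cost(targets[i], cand, weights)
            # Cost of arriving from each predecessor, including the weighted join
            costs = [best[i - 1][k] + join_weight * join_cost(prev, cand)
                     for k, prev in enumerate(candidates[i - 1])]
            k_min = min(range(len(costs)), key=costs.__getitem__)
            row.append(costs[k_min] + tc)
            ptr.append(k_min)
        best.append(row)
        back.append(ptr)
    # Trace back from the cheapest final state
    j = min(range(len(best[-1])), key=best[-1].__getitem__)
    total = best[-1][j]
    path = [j]
    for i in range(len(targets) - 1, 0, -1):
        j = back[i][j]
        path.append(j)
    path.reverse()
    return path, total
```

In this sketch, raising `join_weight` favours sequences of candidates that join smoothly, while lowering it favours candidates that individually match their targets; calling `pre_prune` on each candidate list first bounds the size of the search.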
Applicant |
RHETORICAL GROUP PLC;TAYLOR, PAUL, ALEXANDER;AYLETT, MATTHEW, PETER;FACKRELL, JUSTIN, WYNFORD, ANDREW |
Inventor |
TAYLOR, PAUL, ALEXANDER;AYLETT, MATTHEW, PETER;FACKRELL, JUSTIN, WYNFORD, ANDREW |