摘要 |
A speech synthesis system in which coefficients of a speech synthesis filter are quantized. An LSP or other filter coefficient representation which evolves slowly with time is generated for each of a series of N input speech frames to produce p coefficients in respect of each frame. The coefficients related to the N frames define a pxN matrix, with each row of the matrix containing N coefficients and each coefficient of one row being related to a respective one of the N frames. The matrix is split into a series of submatrices each made up from one or more of the rows, and each submatrix is vector quantized independently of the other submatrices using a composite time/spectral weighting function which for example emphasises distortion associated with high energy regions of the spectrum of each of the N input speech frames and is also proportional to the energy and degree of voicing of each of the N input speech frames. A codebook index is produced which is transmitted and used at the receiver to address a receiver codebook.
|