Abstract |
The loudness, spectral mean and spectral spread of a speech signal are represented in a manner analogous to the brightness, hue and saturation, respectively, of a color in the visual domain. These parameters are extracted from the speech signal and, by various operations, adapted for use in or with other systems. As with color, the values of these parameters are defined relative to reference frames, so that the extracted parameters are to a large degree insensitive to extraneous ambient noise, speaker differences and overall (wideband) filtering. |
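
As a rough illustration of the three parameters named above, the following minimal sketch computes them for a single frame of samples. It is not the method of the source: the function name `speech_color_parameters`, the use of RMS level in dB as a stand-in for loudness, and the power-weighted definitions of spectral mean and spectral spread are all assumptions made for the example, and the normalization against reference frames described above is not attempted.

```python
import cmath
import math


def speech_color_parameters(frame, sample_rate):
    """Return (loudness_db, spectral_mean_hz, spectral_spread_hz) for one frame.

    Hypothetical helper: the definitions used here are illustrative
    assumptions, not taken from the source.
    """
    n = len(frame)

    # Loudness proxy (assumption): RMS level in dB, playing the role of brightness.
    rms = math.sqrt(sum(x * x for x in frame) / n)
    loudness_db = 20.0 * math.log10(max(rms, 1e-12))

    # Power spectrum from a naive DFT over the non-negative frequency bins.
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(
            frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        freqs.append(k * sample_rate / n)
        power.append(abs(coeff) ** 2)

    total = sum(power)
    if total == 0.0:
        # Silent frame: the spectral statistics are undefined, so report zeros.
        return loudness_db, 0.0, 0.0

    # Spectral mean (assumption): power-weighted average frequency, the hue analogue.
    mean = sum(f * p for f, p in zip(freqs, power)) / total

    # Spectral spread (assumption): power-weighted standard deviation about
    # the mean, the saturation analogue.
    variance = sum(((f - mean) ** 2) * p for f, p in zip(freqs, power)) / total
    return loudness_db, mean, math.sqrt(variance)
```

For a pure tone the spectral energy sits in a single bin, so the spread is close to zero, much as a saturated color concentrates its energy around one wavelength. Scaling the frame by a constant changes only the loudness value, which hints at why the spectral parameters can be made insensitive to overall level.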