Abstract |
Perceptual coding of spatial cues (PCSC) is used to convert two or more input audio signals into a combined audio signal that is embedded with two or more sets of one or more auditory scene parameters, where each set of auditory scene parameters (e.g., one or more spatial cues such as an inter-ear level difference (ILD), inter-ear time difference (ITD), and/or head-related transfer function (HRTF)) corresponds to a different frequency band in the combined audio signal. A PCSC-based receiver is able to extract the auditory scene parameters and apply them to the corresponding frequency bands of the combined audio signal to synthesize an auditory scene. The technique used to embed the auditory scene parameters into the combined signal enables a legacy receiver that is unaware of the embedded auditory scene parameters to play back the combined audio signal in a conventional manner, thereby providing backwards compatibility. In one embodiment, two or more input signals are used to generate a mono audio signal with embedded spatial cues. A PCSC-based receiver can extract and apply the spatial cues to generate two (or more) output audio channels, while a legacy receiver is able to play back the mono audio signal in a conventional (i.e., mono) manner. The backwards compatibility feature can be combined with a layered coding technique and/or a multi-descriptive coding technique to improve error protection when the embedded audio signal is transmitted over one or more lossy channels.
|
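
The encode/decode flow described in the abstract can be illustrated with a minimal sketch. It makes several assumptions that are not taken from the source: only the ILD cue is modeled, a naive DFT over a single frame stands in for the band analysis, the cues are carried next to the mono signal in a plain dictionary rather than embedded by any particular mechanism, and the function names (`encode`, `decode`, `legacy_playback`) and the band-edge layout are hypothetical.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    """Naive inverse DFT; returns the real part of each sample."""
    n = len(spec)
    return [(sum(spec[k] * cmath.exp(2j * cmath.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def band_of(k, n, edges):
    """Map DFT bin k to a band index; mirrored bins share a band."""
    f = min(k, n - k)
    for b in range(len(edges) - 1):
        if edges[b] <= f < edges[b + 1]:
            return b
    raise ValueError("bin %d is not covered by the band edges" % k)


def encode(left, right, edges, eps=1e-12):
    """Downmix two channels to mono and estimate one ILD (in dB) per band.

    Assumption: edges lists bin boundaries covering bins 0..n/2.
    """
    mono = [(l + r) / 2 for l, r in zip(left, right)]
    spec_l, spec_r = dft(left), dft(right)
    ild_db = []
    for b in range(len(edges) - 1):
        bins = range(edges[b], edges[b + 1])
        power_l = sum(abs(spec_l[k]) ** 2 for k in bins) + eps
        power_r = sum(abs(spec_r[k]) ** 2 for k in bins) + eps
        ild_db.append(10 * math.log10(power_l / power_r))
    # Hypothetical container: the mono signal plus its per-band cues.
    return {"mono": mono, "ild_db": ild_db, "edges": edges}


def decode(packet):
    """Cue-aware receiver: apply each band's ILD to the mono spectrum."""
    mono, ild_db, edges = packet["mono"], packet["ild_db"], packet["edges"]
    n = len(mono)
    spec_m = dft(mono)
    spec_l, spec_r = [], []
    for k in range(n):
        ratio = 10 ** (ild_db[band_of(k, n, edges)] / 20)  # amplitude ratio L/R
        # Gains sum to 2, so averaging the two outputs returns the mono signal.
        spec_l.append(spec_m[k] * 2 * ratio / (1 + ratio))
        spec_r.append(spec_m[k] * 2 / (1 + ratio))
    return idft(spec_l), idft(spec_r)


def legacy_playback(packet):
    """Legacy receiver: ignore the cues and play the mono signal as-is."""
    return packet["mono"]
```

In this sketch the two synthesis gains in each band are chosen so that their ratio matches the transmitted ILD and their sum is 2. This is one simple design choice, not the only possible one; it keeps the downmix of the synthesized channels identical to the mono signal that a legacy receiver plays.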