<p>A method and related system for encoding audio are disclosed. In the method, data representing a plurality of independent audio signals is accessed. The data representing each respective audio signal comprises a sequence of source frames. Each frame in the sequence of source frames comprises a plurality of audio data copies. Each audio data copy has an associated quality level that is a member of a predefined range of quality levels, ranging from a highest quality level to a lowest quality level. The plurality of source frame sequences is merged into a sequence of target frames, each of which comprises a plurality of target channels. Merging corresponding source frames into a respective target frame includes selecting a quality level and assigning the audio data copy at the selected quality level of each corresponding source frame to at least one respective target channel.</p>
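<p>The merging step can be illustrated with a minimal sketch. It is not the disclosed implementation: the names <code>SourceFrame</code>, <code>TargetFrame</code>, <code>merge_frames</code> and <code>select_by_budget</code> are hypothetical, quality levels are assumed to be integers where a larger value means higher quality, and the byte-budget selection policy is an assumption, since the abstract does not state how the quality level is selected.</p>

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass
class SourceFrame:
    # Hypothetical layout: quality level -> audio data copy at that level.
    copies: Dict[int, bytes]


@dataclass
class TargetFrame:
    quality: int           # quality level selected for this target frame
    channels: List[bytes]  # one target channel per source signal


def select_by_budget(frames: Sequence[SourceFrame],
                     levels: Sequence[int],
                     budget: int) -> int:
    """Assumed policy: pick the highest level whose copies fit the byte budget."""
    for level in sorted(levels, reverse=True):
        if sum(len(f.copies[level]) for f in frames) <= budget:
            return level
    # Nothing fits: fall back to the lowest quality level.
    return min(levels)


def merge_frames(sources: Sequence[Sequence[SourceFrame]],
                 select_quality: Callable[[Sequence[SourceFrame]], int]
                 ) -> List[TargetFrame]:
    """Merge several source frame sequences into one target frame sequence."""
    targets = []
    # zip(*sources) groups the corresponding frames of every source signal.
    for corresponding in zip(*sources):
        quality = select_quality(corresponding)
        # Assign each source's copy at the selected level to its own channel.
        channels = [frame.copies[quality] for frame in corresponding]
        targets.append(TargetFrame(quality, channels))
    return targets
```

<p>In this sketch each source signal maps to exactly one target channel, whereas the abstract allows a copy to be assigned to "at least one" target channel. The selection policy is passed in as a function so that a different rule can be substituted without changing the merge loop.</p>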