Abstract |
The present document relates to an audio encoding and decoding system (referred to as an audio codec system). In particular, the present document relates to a transform-based audio codec system which is particularly well suited for voice encoding/decoding. A transform-based speech encoder (100, 170) configured to encode a speech signal into a bitstream is described. The encoder (100, 170) comprises a framing unit (101) configured to receive a set (132, 332) of blocks; wherein the set (132, 332) of blocks comprises a plurality of sequential blocks (131) of transform coefficients; wherein the plurality of blocks (131) is indicative of samples of the speech signal; wherein a block (131) of transform coefficients comprises a plurality of transform coefficients for a corresponding plurality of frequency bins (301). Furthermore, the encoder (100, 170) comprises an envelope estimation unit (102) configured to determine a current envelope (133) based on the plurality of sequential blocks (131) of transform coefficients; wherein the current envelope (133) is indicative of a plurality of spectral energy values (303) for the corresponding plurality of frequency bins (301). In addition, the encoder (100, 170) comprises an envelope interpolation unit (104) configured to determine a plurality of interpolated envelopes (136) for the plurality of blocks (131) of transform coefficients, respectively, based on the current envelope (133). Furthermore, the encoder (100, 170) comprises a flattening unit (108) configured to determine a plurality of blocks (140) of flattened transform coefficients by flattening the corresponding plurality of blocks (131) of transform coefficients using the corresponding plurality of interpolated envelopes (136), respectively; wherein the bitstream is determined based on the plurality of blocks (140) of flattened transform coefficients. |
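
The chain described above (envelope estimation, envelope interpolation, flattening) may be illustrated by the following minimal sketch. It is not the claimed implementation. Several details are assumptions made purely for illustration and are not stated in the abstract: the envelope is taken as the mean energy per frequency bin across the set of blocks, the interpolation runs linearly in the log-energy domain between a hypothetical previous envelope `prev_env` and the current envelope, and flattening divides each coefficient by the square root of the interpolated energy. The function names and the `floor` parameter are likewise hypothetical.

```python
import math


def estimate_envelope(blocks):
    """Current envelope: one energy value per frequency bin.

    Assumption: the energy of a bin is the mean squared coefficient
    of that bin across all blocks of the set.
    """
    num_bins = len(blocks[0])
    return [sum(b[k] ** 2 for b in blocks) / len(blocks) for k in range(num_bins)]


def interpolate_envelopes(prev_env, cur_env, num_blocks, floor=1e-12):
    """One interpolated envelope per block.

    Assumption: linear interpolation of log-energies between a previous
    envelope and the current one; the last block receives cur_env.
    """
    envelopes = []
    for i in range(num_blocks):
        w = (i + 1) / num_blocks  # interpolation weight of the current envelope
        envelopes.append([
            math.exp((1 - w) * math.log(max(p, floor)) + w * math.log(max(c, floor)))
            for p, c in zip(prev_env, cur_env)
        ])
    return envelopes


def flatten_blocks(blocks, envelopes, floor=1e-12):
    """Divide each coefficient by the amplitude implied by its envelope energy."""
    return [
        [x / math.sqrt(max(e, floor)) for x, e in zip(block, env)]
        for block, env in zip(blocks, envelopes)
    ]


def encode_set(blocks, prev_env=None):
    """Run the three steps on one set of blocks; returns (cur_env, flattened)."""
    cur_env = estimate_envelope(blocks)
    if prev_env is None:
        prev_env = cur_env  # no history available: reuse the current envelope
    envelopes = interpolate_envelopes(prev_env, cur_env, len(blocks))
    return cur_env, flatten_blocks(blocks, envelopes)
```

In this sketch, the flattened blocks returned by `encode_set` stand in for the blocks (140) from which the bitstream would be derived; quantization and entropy coding are deliberately left out, since the abstract does not specify them.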