Abstract |
For video compression processing, each frame in a video sequence is segmented into one or more regions, where all macroblocks of a given region are encoded using the same quantizer value, but the quantizer value can vary from region to region within a frame. For example, for the videophone or video-conferencing paradigm of one or more "talking heads" in front of a relatively static background, each frame is segmented into a foreground region corresponding to the talking head, a background region corresponding to the static background, and an intervening transition region. An encoding complexity measure is generated for each macroblock of the previous frame using a rate-distortion model (e.g., a first-order model), and the resulting macroblock-level encoding complexities are used to generate an average encoding complexity for each region. These region complexities are then used to select a quantizer value for each region in the current frame, e.g., iteratively until the target bit rate for the frame is satisfied to within a specified tolerance range. The selected quantizer values may be modified based on spatial constraints, to satisfy spatial requirements of the video compression algorithm, and/or temporal constraints, to provide temporal smoothness in quality.
|
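The rate-control loop summarized in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation. It assumes a first-order model in which the bits spent on a macroblock are roughly X / Q, so the complexity is estimated as X = bits * Q from the previous frame. The function names, the fixed per-region quantizer offsets, the 1..31 quantizer range, and the 5% tolerance are all hypothetical choices made for illustration, and the spatial and temporal adjustments of the selected quantizers are omitted.

```python
from statistics import mean

Q_MIN, Q_MAX = 1, 31  # assumed quantizer range (H.263-style); not stated in the abstract


def region_complexities(bits, quants, labels):
    """Average first-order complexity X = bits * Q per region.

    bits, quants, labels describe the macroblocks of the previous frame:
    bits used, quantizer used, and region label of each macroblock.
    """
    per_region = {}
    for b, q, region in zip(bits, quants, labels):
        per_region.setdefault(region, []).append(b * q)
    return {region: mean(xs) for region, xs in per_region.items()}


def predicted_bits(complexity, counts, quant):
    """Predicted frame size under the assumed model R = X / Q per macroblock."""
    return sum(counts[r] * complexity[r] / quant[r] for r in counts)


def select_quantizers(complexity, counts, target_bits, tolerance=0.05, offsets=None):
    """Step a base quantizer until the predicted bits fall within tolerance.

    offsets (hypothetical) make the transition and background regions coarser
    than the foreground. If no setting meets the tolerance, the closest one
    is returned.
    """
    offsets = offsets or {"foreground": 0, "transition": 2, "background": 4}
    best = None
    for base in range(Q_MIN, Q_MAX + 1):
        quant = {r: min(Q_MAX, max(Q_MIN, base + offsets[r])) for r in counts}
        estimate = predicted_bits(complexity, counts, quant)
        if abs(estimate - target_bits) <= tolerance * target_bits:
            return quant, estimate
        if best is None or abs(estimate - target_bits) < abs(best[1] - target_bits):
            best = (quant, estimate)
    return best


if __name__ == "__main__":
    # Toy previous-frame statistics for six macroblocks (made-up numbers).
    bits = [400, 400, 200, 100, 100, 100]
    quants = [10] * 6
    labels = ["foreground"] * 2 + ["transition"] + ["background"] * 3
    cx = region_complexities(bits, quants, labels)
    counts = {"foreground": 2, "transition": 1, "background": 3}
    print(select_quantizers(cx, counts, target_bits=1000))
```

Scanning the base quantizer upward is only one simple way to iterate. Because the predicted bit count falls monotonically as the quantizers grow, a bisection search would serve equally well under the same assumed model.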