Title of Invention AUDIOVISUAL INFORMATION PROCESSING IN VIDEOCONFERENCING
Abstract Embodiments of the present invention relate to audiovisual stream processing in videoconferences. For each audiovisual stream in a videoconference, a sound level of the audiovisual stream is detected. If the sound level exceeds a predefined threshold level, the audiovisual stream is processed with a first configuration. If the sound level is below the predefined threshold level, the audiovisual stream is processed with a second configuration. The second configuration is more resource-effective than the first configuration.
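The selection rule in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the record: the `StreamConfig` type, the bitrate values, the use of decibels as the unit, and the choice to treat a level exactly equal to the threshold as "not exceeding" (the abstract does not say which branch applies in that case).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamConfig:
    """Hypothetical processing configuration; fields are illustrative only."""
    name: str
    video_kbps: int
    audio_kbps: int


# Assumed example values; the record gives no concrete numbers.
FIRST_CONFIG = StreamConfig(name="first", video_kbps=1500, audio_kbps=64)
# The second configuration is the more resource-effective one.
SECOND_CONFIG = StreamConfig(name="second", video_kbps=300, audio_kbps=16)


def choose_configuration(sound_level_db: float, threshold_db: float) -> StreamConfig:
    """Pick a configuration for one stream from its detected sound level."""
    if sound_level_db > threshold_db:
        return FIRST_CONFIG
    # Assumption: a level equal to the threshold is handled like a lower level.
    return SECOND_CONFIG
```

Applied per stream, an active speaker's stream would keep the first configuration while quieter streams fall back to the cheaper one.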
Publication No. US2017104961(A1) Publication Date 2017.04.13
Application No. US201615364701 Filing Date 2016.11.30
Applicant International Business Machines Corporation Inventors Pan Yang; Su Wei; Zhang Yi; Zhang Yi Jian
Classification H04N7/15; H04N21/233; H04N5/91 Main Classification H04N7/15
Agency Agent
Principal Claim 1. A computer system for processing a plurality of audiovisual streams in a videoconference, the computer system comprising: one or more computer processors; one or more computer readable storage media; program instructions stored on the one or more computer readable storage media for execution by at least one of the one or more computer processors, the program instructions comprising: program instructions to detect a sound level of an audiovisual stream in a videoconference based on determining an average sound level of the audiovisual stream over a predefined time period, decomposing the audiovisual stream into an audio component and a video component and analyzing the audio component to determine the sound level wherein analyzing the audio component is based on at least one of sound intensity, sound pressure, sound power, sound energy density and sound loudness; in response to the sound level exceeding a first predefined threshold level, program instructions to process the audiovisual stream with a first configuration based on a first quality level wherein exceeding the first predefined threshold level comprises determining that the sound level of the audiovisual stream does not fall below the first predefined threshold level for a sequential time period greater than a predefined threshold time period; in response to the sound level being below the first predefined threshold level and above a second predefined sound level, program instructions to process the audiovisual stream with a second configuration based on a second quality level, wherein the second configuration is more resource-effective than the first configuration and the second quality level is lower than the first quality level wherein the first quality level and the second quality level are based on signal-to-noise ratio and at least one of frequency response, stereo crosstalk or output power; in response to the sound level being below the second predefined sound level, program instructions to discard the audiovisual stream; program instructions to superimpose the audio component of the audiovisual stream with audio components of further audiovisual streams associated with the videoconference wherein the further audiovisual streams are processed with the first configuration; program instructions to combine the video component of the audiovisual stream with video components of the further audiovisual streams; and program instructions to render the audiovisual stream in a display area, wherein an appearance of the display area is determined based on the sound level of the audiovisual stream.
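The three-way decision recited in the claim (first configuration, second configuration, or discard) can be sketched as follows. This is one possible reading, not the patented implementation. The names `Action`, `classify_stream`, and `min_run` are hypothetical, levels are assumed to be per-frame values in a single unit such as decibels, and the "sequential time period" is approximated as a count of consecutive frames. The claim does not say how to treat a stream whose level is above the first threshold only briefly, or exactly equal to the second level; the sketch assigns both cases to the second configuration as an assumption.

```python
from enum import Enum
from statistics import fmean
from typing import Sequence


class Action(Enum):
    FIRST_CONFIG = "first"    # higher quality level
    SECOND_CONFIG = "second"  # lower quality level, more resource-effective
    DISCARD = "discard"


def longest_run_at_or_above(levels: Sequence[float], threshold: float) -> int:
    """Length of the longest consecutive run of frames not below threshold."""
    best = run = 0
    for level in levels:
        run = run + 1 if level >= threshold else 0
        best = max(best, run)
    return best


def classify_stream(
    levels: Sequence[float],
    first_threshold: float,
    second_threshold: float,
    min_run: int,
) -> Action:
    """Classify one stream from its per-frame sound levels in a time window."""
    if not levels:
        raise ValueError("levels must contain at least one frame")
    # "Exceeding" is read as: the level stays at or above the first threshold
    # for more than min_run consecutive frames (assumed unit of time).
    if longest_run_at_or_above(levels, first_threshold) > min_run:
        return Action.FIRST_CONFIG
    # Otherwise the average level over the window decides between the
    # second configuration and discarding the stream.
    if fmean(levels) < second_threshold:
        return Action.DISCARD
    return Action.SECOND_CONFIG
```

For example, with `first_threshold=-30`, `second_threshold=-50`, and `min_run=3`, a window of four frames at -20 maps to the first configuration, four frames at -40 map to the second configuration, and four frames at -60 are discarded.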
Address Armonk NY US