Title of invention: Method and apparatus for voice communication based on voice activity detection
Abstract: A voice communication method and apparatus, and a method and apparatus for operating a jitter buffer, are described. Audio blocks are acquired in sequence. Each of the audio blocks includes one or more audio frames. Voice activity detection is performed on the audio blocks. In response to deciding voice onset for a present one of the audio blocks, a subsequence of the sequence of the acquired audio blocks is retrieved. The subsequence immediately precedes the present audio block. The subsequence has a predetermined length, and non-voice is decided for each audio block in the subsequence. The present audio block and the audio blocks in the subsequence are transmitted to a receiving party. The audio blocks in the subsequence are identified as reprocessed audio blocks. In response to deciding non-voice for the present audio block, the present audio block is cached.
Publication number: US9571425(B2) Publication date: 2017.02.14
Application number: US201314384327 Filing date: 2013.03.21
Applicant: Dolby Laboratories Licensing Corporation Inventors: Dickins Glenn N.; Sun Xuejing; Costa Brendon
Classification: H04L12/66; H04L12/861; H04M3/56; G10L25/78; G10L19/16; H04L29/06 Main classification: H04L12/66
Agency: Fountainhead Law Group, P.C. Agent: Hamilton Charles L.; Fountainhead Law Group, P.C.
Principal claim: 1. A method of performing voice communication based on voice activity detection, comprising: acquiring audio blocks in sequence, wherein each of the audio blocks includes one or more audio frames; performing voice activity detection on the audio blocks; and in response to deciding voice onset for a present one of the audio blocks, retrieving a subsequence of the sequence of the acquired audio blocks, including a number of audio blocks which precede the present audio block immediately, wherein the subsequence has a predetermined length and non-voice is decided for each audio block in the subsequence; and transmitting the present audio block and the audio blocks in the subsequence to a receiving party, wherein the audio blocks in the subsequence are identified as reprocessed audio blocks to inform the receiving party that these audio blocks are different from the present audio block and reprocessed as including voice; and in response to deciding non-voice for the present audio block, caching the present audio block, wherein before the step of transmitting, the method further comprises: reprocessing the subsequence by regarding the earliest audio block in the subsequence as voice onset and regarding each following audio block in the subsequence and the present audio block as voice continuation.
Address: San Francisco CA US
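
The sender-side steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the names `OnsetSender`, `Packet`, `is_voice`, `send` and `lookback` are hypothetical, the voice activity detector is passed in as a plain callable, and the cache is assumed to be a bounded queue of the most recent non-voice blocks. Two further assumptions go beyond the claim text: blocks that follow an onset without an intervening non-voice block are sent as ordinary voice continuation, and fewer than `lookback` blocks are sent when the cache holds fewer.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable


@dataclass
class Packet:
    block: bytes        # one audio block (one or more frames)
    label: str          # "onset" or "continuation"
    reprocessed: bool   # True for cached blocks re-labelled as voice


class OnsetSender:
    """Hypothetical sender: caches non-voice blocks, flushes them on onset."""

    def __init__(self, is_voice: Callable[[bytes], bool],
                 send: Callable[[Packet], None], lookback: int = 3):
        self.is_voice = is_voice              # stand-in for the VAD decision
        self.send = send                      # stand-in for the transport
        self.cache = deque(maxlen=lookback)   # most recent non-voice blocks
        self.in_voice = False

    def push(self, block: bytes) -> None:
        if not self.is_voice(block):
            # Non-voice decided: cache the present block, transmit nothing.
            self.cache.append(block)
            self.in_voice = False
            return
        if self.in_voice:
            # Assumption: ordinary voice after the onset is sent directly.
            self.send(Packet(block, "continuation", False))
            return
        # Voice onset decided for the present block: retrieve the cached
        # blocks that immediately precede it and reprocess them as voice.
        self.in_voice = True
        earlier = list(self.cache)
        self.cache.clear()
        for i, old in enumerate(earlier):
            # Earliest cached block becomes the onset, the rest continuation.
            label = "onset" if i == 0 else "continuation"
            self.send(Packet(old, label, True))
        # The present block is continuation unless nothing was cached.
        self.send(Packet(block, "continuation" if earlier else "onset", False))


# Toy run: blocks starting with b"V" are treated as voice by the fake VAD.
sent = []
sender = OnsetSender(is_voice=lambda b: b.startswith(b"V"),
                     send=sent.append, lookback=2)
for blk in [b"n1", b"n2", b"n3", b"n4", b"V1", b"V2", b"n5", b"V3"]:
    sender.push(blk)
for p in sent:
    print(p)
```

The `reprocessed` flag plays the role of the identification that tells the receiving party which blocks were originally classified as non-voice. Clearing the cache on each onset keeps the retrieved blocks contiguous with the present block, since voice blocks are never cached.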