Title of invention: NEURAL NETWORK VOICE ACTIVITY DETECTION EMPLOYING RUNNING RANGE NORMALIZATION
Abstract: A “running range normalization” method computes running estimates of the minimum and maximum values of features useful for voice activity detection (VAD) and normalizes the features by mapping their original range to a desired range. Smoothing coefficients are optionally selected to directionally bias the rate of change of at least one of the running estimates of the minimum and maximum values. The normalized VAD features are used to train a machine learning algorithm to detect voice activity, and the trained algorithm is then used to isolate or enhance the speech component of the audio data.
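The directional bias mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give an update rule, so the one-pole smoothing form, the function name `update_running_range`, and the coefficient values `fast` and `slow` below are all assumptions chosen for illustration, not the patent's disclosed method. The idea shown is that each estimate moves quickly when the feature pushes it outward and only slowly when it relaxes inward.

```python
def update_running_range(value, run_min, run_max, fast=0.5, slow=0.01):
    """One-frame update of running min/max estimates (illustrative only).

    `fast` and `slow` are hypothetical smoothing coefficients; choosing
    them unequal biases the direction in which each estimate changes.
    """
    # Minimum estimate: follow downward excursions quickly, drift upward slowly.
    a_min = fast if value < run_min else slow
    run_min = run_min + a_min * (value - run_min)

    # Maximum estimate: follow upward excursions quickly, drift downward slowly.
    a_max = fast if value > run_max else slow
    run_max = run_max + a_max * (value - run_max)

    return run_min, run_max


# Example: a feature value well above the current range pulls the maximum
# up by half the gap while the minimum barely moves.
new_min, new_max = update_running_range(10.0, 0.0, 1.0)
```

With these assumed coefficients, a sudden loud frame widens the range almost immediately, while a long quiet stretch narrows it only gradually. This simple form does not guarantee that the minimum stays below the maximum in every edge case, so a real implementation would need to guard against that.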
Publication number: US2016093313(A1) Publication date: 2016.03.31
Application number: US201514866824 Filing date: 2015.09.25
Applicant: CYPHER, LLC Inventor: Vickers Earl
Classification: G10L21/0264;G10L21/0224;G10L25/84;G10L25/30;G10L25/60 Main classification: G10L21/0264
Agency: Agent:
Principal claim: 1. A method of obtaining normalized voice activity detection features from an audio signal comprising the steps of: at a computing system, dividing an audio signal into a sequence of time frames; computing one or more voice activity detection feature of the audio signal for each of the time frames; computing running estimates of minimum and maximum values of the one or more voice activity detection feature of the audio signal for each of the time frames; computing input ranges of the one or more voice activity detection feature by comparing the running estimates of the minimum and maximum values of the one or more voice activity detection feature of the audio signal for each of the time frames; and mapping the one or more voice activity detection feature of the audio signal for each of the time frames from the input ranges to one or more desired target range to obtain one or more normalized voice activity detection feature.
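The steps of claim 1 can be sketched end to end as follows. This is a hedged illustration under several assumptions that the claim itself does not specify: the feature is taken to be per-frame log energy, frames are non-overlapping, the running estimates snap to new extremes and otherwise relax with a single hypothetical coefficient `alpha`, and the target range defaults to [-1, 1]. The function names `frame_signal`, `log_energy`, and `normalize_features` are invented for this example.

```python
import math


def frame_signal(samples, frame_len):
    """Divide the signal into non-overlapping frames (a partial tail is dropped)."""
    last_start = len(samples) - frame_len
    return [samples[i:i + frame_len] for i in range(0, last_start + 1, frame_len)]


def log_energy(frame):
    """Assumed VAD feature: log of the mean squared amplitude of one frame."""
    return math.log(sum(x * x for x in frame) / len(frame) + 1e-12)


def normalize_features(features, target=(-1.0, 1.0), alpha=0.05, eps=1e-9):
    """Map each per-frame feature from its running input range to `target`."""
    if not features:
        return []
    lo, hi = target
    run_min = run_max = features[0]
    normalized = []
    for f in features:
        # Running estimates: jump to a new extreme, otherwise relax toward f.
        run_min = f if f < run_min else run_min + alpha * (f - run_min)
        run_max = f if f > run_max else run_max + alpha * (f - run_max)

        # Input range obtained by comparing the two running estimates.
        in_range = run_max - run_min
        if in_range < eps:
            # Degenerate range (e.g. the first frame): use the target midpoint.
            normalized.append(0.5 * (lo + hi))
        else:
            # Linear mapping from [run_min, run_max] to [lo, hi].
            normalized.append(lo + (f - run_min) * (hi - lo) / in_range)
    return normalized


# Usage: two quiet frames followed by two loud frames.
samples = [0.01] * 8 + [1.0] * 8
features = [log_energy(fr) for fr in frame_signal(samples, 4)]
normalized = normalize_features(features)
```

Because each update keeps the current feature value between the two running estimates, the mapped output always falls inside the target range. In this example the quiet frames map to the midpoint while the range is still degenerate, and the loud frames map to the top of the range.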
Address: South Jordan UT US