Title of invention: Frequency-domain post-filtering voice-activity detector
Abstract: A voice-activity detector (VAD 104) takes (214) a currently-received set and a previously-received set of samples of a time-domain (voice) signal, converts (216) them into a frequency-domain representation of the signal, filters out (218) negative and low (noise) frequencies, weights (220) the energies of frequency bins (ranges) of the remaining frequencies proportionately to their frequencies, and computes (220) the total power of the ranges. It first initializes (226) by determining (304, 306) if power peaks of any of the ranges exceed a first threshold (ceiling 228); if not, it lowers (302) the ceiling and continues initializing, and if so, it ends initializing (308), indicates (334) that voice has been detected, sets (330) the ceiling to the highest peak, and stores (332) the total power as a "smoothed" power. If initialization has ended, it determines (320, 322) if power peaks of any of the ranges exceed a second threshold that is a fraction of the ceiling; if so, it indicates (334) that voice has been detected, sets (330) the ceiling to the highest peak that exceeds the ceiling, and computes (332) a new "smoothed" power as a function of the total power and the current "smoothed" power. If initialization has ended and energy peaks of none of the ranges exceed the second threshold, it determines (340, 342) if a ratio of the total power and the smoothed power exceeds a third threshold; if so, it indicates (344) that voice has been detected, and if not, it indicates (346) that voice has not been detected.
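The decision flow described in the abstract can be sketched in Python as follows. This is a minimal illustration, not the implementation disclosed in the application. The class name `PostFilterVad`, the helper `weighted_bin_powers`, and every numeric default (sample rate, low-frequency cutoff, initial ceiling, ceiling decay, peak fraction, ratio threshold, smoothing factor) are hypothetical. The sketch also assumes that each retained DFT bin is one "range", that the weight is the bin frequency divided by the Nyquist frequency, that the smoothed power is updated by exponential averaging, and that no voice is reported while initialization continues; the abstract fixes none of these details.

```python
import cmath
import math


def weighted_bin_powers(prev_frame, cur_frame, sample_rate, low_cut_hz):
    """Frequency-weighted power of each retained positive-frequency DFT bin."""
    samples = list(prev_frame) + list(cur_frame)
    n = len(samples)
    nyquist = sample_rate / 2.0
    powers = []
    # Only bins 1..n/2 are visited, so negative frequencies are never used.
    for k in range(1, n // 2 + 1):
        freq = k * sample_rate / n
        if freq < low_cut_hz:
            continue  # discard low (noise) frequencies
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        # Assumed weighting: proportional to frequency, normalized by Nyquist.
        powers.append((freq / nyquist) * abs(coeff) ** 2 / n)
    return powers


class PostFilterVad:
    """Hypothetical sketch of the three-threshold decision in the abstract."""

    def __init__(self, sample_rate=8000, low_cut_hz=300.0, ceiling=0.5,
                 ceiling_decay=0.9, peak_fraction=0.5, ratio_threshold=2.0,
                 smoothing=0.9):
        self.sample_rate = sample_rate
        self.low_cut_hz = low_cut_hz
        self.ceiling = ceiling              # first threshold
        self.ceiling_decay = ceiling_decay  # how fast the ceiling is lowered
        self.peak_fraction = peak_fraction  # second threshold = fraction * ceiling
        self.ratio_threshold = ratio_threshold  # third threshold
        self.smoothing = smoothing
        self.initializing = True
        self.smoothed_power = 0.0
        self.prev_frame = None

    def process(self, frame):
        """Return True if voice is indicated for this frame."""
        prev = self.prev_frame if self.prev_frame is not None else [0.0] * len(frame)
        self.prev_frame = list(frame)
        powers = weighted_bin_powers(prev, frame, self.sample_rate, self.low_cut_hz)
        total = sum(powers)
        peak = max(powers, default=0.0)

        if self.initializing:
            if peak > self.ceiling:
                self.initializing = False
                self.ceiling = peak
                self.smoothed_power = total
                return True
            self.ceiling *= self.ceiling_decay  # lower the ceiling, keep initializing
            return False  # assumption: no voice reported while still initializing

        if peak > self.peak_fraction * self.ceiling:
            self.ceiling = max(self.ceiling, peak)  # raise only if the peak exceeds it
            self.smoothed_power = (self.smoothing * self.smoothed_power
                                   + (1.0 - self.smoothing) * total)
            return True

        # Fallback: compare total power against the smoothed power.
        if self.smoothed_power > 0.0:
            return total / self.smoothed_power > self.ratio_threshold
        return False
```

A caller would create one `PostFilterVad` per audio stream and pass successive equal-length frames to `process`. The direct DFT here costs O(n²) per frame and is chosen only to keep the sketch dependency-free; a real-time system would presumably use an FFT.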
Publication number: US2002103636(A1)  Publication date: 2002.08.01
Application number: US20010770922  Filing date: 2001.01.26
Applicant: TUCKER LUKE A.;WILDIE MARK GREIG  Inventor: TUCKER LUKE A.;WILDIE MARK GREIG
Classification: G10L11/02;(IPC1-7):G10L21/00  Main classification: G10L11/02
Agency:  Agent:
Principal claim:
Address: