Title of invention: ROBUST SPEECH BOUNDARY DETECTION SYSTEM AND METHOD
Abstract: A system for audio processing comprises: an initial background statistical model system configured to generate an initial background statistical model using a predetermined sample size of audio data; a parameter computation system configured to generate parametric data for the audio data, including cepstral and energy parameters; a background statistics computation system configured to generate preliminary background statistics for determining whether speech has been detected; a first speech detection system configured to determine whether speech was present in the initial sample of audio data; an adaptive background statistical model system configured to provide an adaptive background statistical model for use in continuous processing of audio data for speech detection; a parameter computation system configured to calculate cepstral parameters, energy parameters and other suitable parameters for speech detection; a speech/non-speech classification system configured to classify individual frames as speech frames or non-speech frames, based on the computed parameters and the adaptive background statistical model data; a background statistics update system configured to update the background statistical model based on detected speech and non-speech frames; and a second speech detection system configured to perform speech detection processing and to generate a suitable indicator for use in processing audio data that is determined to include speech signals.
Publication number: US2014249812(A1) Publication date: 2014.09.04
Application number: US201414197149 Filing date: 2014.03.04
Applicant: Conexant Systems, Inc. Inventors: Bou-Ghazale Sahar E.;Thormundsson Trausti;Wu Willie B.
Classification: G10L25/84;G10L25/87 Main classification: G10L25/84
Agency: Agent:
Main claim: 1. A system for audio processing comprising: an initial background statistical model system operating on a processor and configured to generate an initial background statistical model using an initial sample of audio data; a parameter computation system operating on the processor and configured to generate parametric data for the initial sample of audio data; a background statistics computation system operating on the processor and configured to receive the parametric data and to generate preliminary background statistics for determining whether speech has been detected; and a first speech detection system operating on the processor and configured to determine whether speech was present in the initial sample of audio data using the preliminary background statistics.
Address: Irvine, CA, US
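
The processing chain named in the abstract (initial background statistics, per-frame parameters, speech/non-speech classification, background update) can be illustrated with a minimal sketch. This is not the patented method: it uses log energy as the only parameter and omits cepstral parameters, it assumes the initial frames contain no speech rather than testing for it, and it updates the background on non-speech frames only. The function names, the threshold factor `k`, the smoothing factor `alpha`, and the floor `min_std` are all hypothetical choices made for illustration.

```python
import math


def frame_log_energy(frame):
    """Natural-log mean energy of one frame of samples."""
    energy = sum(s * s for s in frame) / len(frame)
    return math.log(energy + 1e-12)  # small floor avoids log(0)


def initial_background(frames):
    """Mean and variance of log energy over the initial sample (assumed speech-free)."""
    values = [frame_log_energy(f) for f in frames]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var


def detect_speech(frames, init_frames=10, k=3.0, alpha=0.05, min_std=0.5):
    """Label each frame after the initial sample as speech (True) or non-speech (False).

    All parameter values are illustrative assumptions, not taken from the patent.
    """
    mean, var = initial_background(frames[:init_frames])
    labels = []
    for frame in frames[init_frames:]:
        e = frame_log_energy(frame)
        # Floor the deviation so a very steady background does not
        # produce a threshold that sits exactly on the background level.
        std = max(math.sqrt(var), min_std)
        is_speech = e > mean + k * std
        labels.append(is_speech)
        if not is_speech:
            # Adapt the background statistics with exponential smoothing,
            # using non-speech frames only (a simplification).
            delta = e - mean
            mean += alpha * delta
            var = (1 - alpha) * (var + alpha * delta * delta)
    return labels
```

Calling `detect_speech(frames)` on a list of equal-length sample frames returns one boolean per frame after the first `init_frames`; a run of `True` values marks a candidate speech segment whose first and last frames approximate its boundaries.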