Title: Systems and methods for segmenting and/or classifying an audio signal from transformed audio information
Abstract: A system and method may be provided to segment and/or classify an audio signal from transformed audio information. Transformed audio information representing a sound may be obtained. The transformed audio information may specify magnitude of a coefficient related to energy amplitude as a function of frequency for the audio signal and time. Features associated with the audio signal may be obtained from the transformed audio information. Individual ones of the features may be associated with a feature score relative to a predetermined speaker model. An aggregate score may be obtained based on the feature scores according to a weighting scheme. The weighting scheme may be associated with a noise and/or SNR estimation. The aggregate score may be used for segmentation to identify portions of the audio signal containing speech of one or more different speakers. For classification, the aggregate score may be used to determine a likely speaker model to identify a source of the sound in the audio signal.
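The aggregation and classification steps described in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not say which features are scored or how the weights depend on the SNR estimate, so the three-feature ordering, the 10 dB threshold, the weight values, and the function names `pick_weights`, `aggregate_score`, and `classify` are all assumptions made for illustration.

```python
from typing import Dict, List


def pick_weights(snr_db: float) -> List[float]:
    """Hypothetical weighting scheme tied to an SNR estimate.

    Assumed feature order: [pitch, spectral envelope, harmonic amplitude].
    The threshold and the weight values are illustrative, not from the patent.
    """
    if snr_db < 10.0:
        # Noisy signal: lean on the feature assumed to be more noise-robust.
        return [0.5, 0.2, 0.3]
    return [0.3, 0.4, 0.3]


def aggregate_score(feature_scores: List[float], weights: List[float]) -> float:
    """Weighted average of per-feature scores for one speaker model."""
    if len(feature_scores) != len(weights):
        raise ValueError("feature_scores and weights must have equal length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * s for w, s in zip(weights, feature_scores)) / total


def classify(scores_by_model: Dict[str, List[float]], snr_db: float) -> str:
    """Return the speaker model with the highest aggregate score."""
    weights = pick_weights(snr_db)
    return max(
        scores_by_model,
        key=lambda model: aggregate_score(scores_by_model[model], weights),
    )


if __name__ == "__main__":
    scores = {
        "speaker_a": [0.9, 0.1, 0.1],
        "speaker_b": [0.1, 0.9, 0.1],
    }
    print(classify(scores, snr_db=5.0))
    print(classify(scores, snr_db=20.0))
```

In this sketch the same feature scores can lead to a different likely speaker model when the SNR estimate changes, which is the effect a noise-dependent weighting scheme is meant to have.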
Publication number: US9601119(B2) Publication date: 2017.03.21
Application number: US201414481918 Filing date: 2014.09.10
Applicant: KnuEdge Incorporated Inventors: Bradley David C.;Hilton Robert N.;Goldin Daniel S.;Fisher Nicholas K.;Roos Derrick R.;Wiewiora Eric
Classification: H04R29/00;G10L17/02;H04R3/00;G10L25/51;G10L25/84 Main classification: H04R29/00
Agency: Edell, Shapiro & Finnan, LLC Agent: Edell, Shapiro & Finnan, LLC
Principal claim: 1. A system configured for segmenting an audio signal to identify portions of the audio signal containing speech of one or more different speakers, the system comprising: one or more processors configured by computer readable instructions to: obtain transformed audio information representing a sound including determining harmonic paths for individual harmonics of the sound based on fractional chirp rate and harmonic number, wherein the transformed audio information specifies magnitude of a coefficient related to energy amplitude as a function of frequency for the audio signal and time; obtain features associated with the audio signal from the transformed audio information, individual ones of the features being associated with a feature score; and obtain an aggregate score based on the feature scores, the aggregate score being used for segmentation to identify portions of the audio signal containing speech of one or more different speakers.
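Two steps named in the claim, harmonic paths and score-based segmentation, can be sketched in Python under stated assumptions. The claim does not give formulas, so the sketch assumes that the fractional chirp rate is the rate of change of frequency divided by frequency and that it stays constant over the window, which yields an exponential path for each harmonic. The segmentation rule (a fixed threshold on per-frame aggregate scores) and the names `harmonic_path` and `segment` are likewise hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def harmonic_path(
    f0_hz: float, chirp_rate: float, harmonic: int, times_s: Sequence[float]
) -> List[float]:
    """Frequency of one harmonic over time.

    Assumes a constant fractional chirp rate chi = (df/dt) / f, in 1/s, so the
    pitch follows f0 * exp(chi * t) and harmonic n sits at n times the pitch.
    """
    return [harmonic * f0_hz * math.exp(chirp_rate * t) for t in times_s]


def segment(scores: Sequence[float], threshold: float) -> List[Tuple[int, int]]:
    """Group consecutive frames whose aggregate score meets the threshold.

    Returns (start, end) frame-index pairs with `end` exclusive. The fixed
    threshold is an illustrative choice, not taken from the claim.
    """
    segments: List[Tuple[int, int]] = []
    start = None
    for i, score in enumerate(scores):
        if score >= threshold and start is None:
            start = i  # a speech segment begins at this frame
        elif score < threshold and start is not None:
            segments.append((start, i))  # the segment ended before this frame
            start = None
    if start is not None:
        segments.append((start, len(scores)))  # segment runs to the last frame
    return segments


if __name__ == "__main__":
    print(harmonic_path(100.0, 0.0, 2, [0.0, 0.1]))
    print(segment([0.1, 0.8, 0.9, 0.2, 0.7], 0.5))
```

With a chirp rate of zero every harmonic path is flat at a multiple of the pitch; a positive rate bends all harmonics upward together, which is why a single fractional chirp rate plus the harmonic number is enough to place each path.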
Address: San Diego CA US