Title: A MEGA SPEAKER IDENTIFICATION (ID) SYSTEM AND CORRESPONDING METHODS THEREFOR
Abstract: <p>A memory storing computer readable instructions for causing a processor associated with a mega speaker identification (ID) system to instantiate functions including an audio segmentation and classification function (F10) receiving general audio data (GAD) and generating segments, a feature extraction function (F12) receiving the segments and extracting features based on mel-frequency cepstral coefficients (MFCC) therefrom, a learning and clustering function (14) receiving the extracted features and reclassifying segments, when required, based on the extracted features, a matching and labeling function (16) assigning a speaker ID to speech signals within the GAD, and a database function for correlating the assigned speaker ID to the respective speech signals within the GAD. The audio segmentation and classification function can assign each segment to one of N audio signal classes including silence, single speaker speech, music, environmental noise, multiple speakers' speech, simultaneous speech and music, and speech and noise.</p>
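The data flow described in the abstract (classified segments, per-segment features, speaker labeling, and a database correlating speaker IDs with speech signals) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent: the names `AudioClass`, `Segment`, `label_segments`, and `build_index` are hypothetical, the two-element feature tuples stand in for MFCC-based features, and nearest-reference matching with a distance threshold is a simple substitute for the learning, clustering, and matching functions, whose actual methods the abstract does not specify.

```python
from dataclasses import dataclass
from enum import Enum
from math import dist
from typing import Dict, List, Optional, Tuple


class AudioClass(Enum):
    """The seven audio signal classes listed in the abstract."""
    SILENCE = "silence"
    SINGLE_SPEAKER_SPEECH = "single speaker speech"
    MUSIC = "music"
    ENVIRONMENTAL_NOISE = "environmental noise"
    MULTIPLE_SPEAKER_SPEECH = "multiple speakers' speech"
    SPEECH_AND_MUSIC = "simultaneous speech and music"
    SPEECH_AND_NOISE = "speech and noise"


@dataclass
class Segment:
    start: float                  # seconds from the start of the audio
    end: float
    audio_class: AudioClass
    features: Tuple[float, ...]   # placeholder for MFCC-based features
    speaker_id: Optional[str] = None


def label_segments(segments: List[Segment],
                   speaker_models: Dict[str, Tuple[float, ...]],
                   max_distance: float) -> List[Segment]:
    """Assign a speaker ID to single-speaker segments (assumed rule).

    Each segment takes the ID of the nearest reference vector, or stays
    unlabeled when no reference lies within max_distance.
    """
    for seg in segments:
        if seg.audio_class is not AudioClass.SINGLE_SPEAKER_SPEECH:
            continue
        best_id = min(speaker_models,
                      key=lambda sid: dist(seg.features, speaker_models[sid]),
                      default=None)
        if best_id is not None and \
                dist(seg.features, speaker_models[best_id]) <= max_distance:
            seg.speaker_id = best_id
    return segments


def build_index(segments: List[Segment]) -> Dict[str, List[Tuple[float, float]]]:
    """Database stand-in: map each speaker ID to its (start, end) spans."""
    index: Dict[str, List[Tuple[float, float]]] = {}
    for seg in segments:
        if seg.speaker_id is not None:
            index.setdefault(seg.speaker_id, []).append((seg.start, seg.end))
    return index


# Toy input: segmentation and feature extraction are assumed already done.
segments = [
    Segment(0.0, 2.0, AudioClass.SINGLE_SPEAKER_SPEECH, (1.0, 0.0)),
    Segment(2.0, 3.0, AudioClass.MUSIC, (5.0, 5.0)),
    Segment(3.0, 5.0, AudioClass.SINGLE_SPEAKER_SPEECH, (0.0, 1.1)),
]
models = {"speaker_A": (1.0, 0.1), "speaker_B": (0.0, 1.0)}
index = build_index(label_segments(segments, models, max_distance=0.5))
```

In this toy run the music segment is skipped, and each speech segment is attached to the reference vector it lies closest to, so `index` lets a caller look up where a given speaker talks in the audio.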
Publication number: WO2004001720(A1) Publication date: 2003.12.31
Application number: WO2003IB02429 Filing date: 2003.06.04
Applicant: KONINKLIJKE PHILIPS ELECTRONICS N.V.;U.S. PHILIPS CORPORATION Inventors: DIMITROVA, NEVENKA;LI, DONGGE
Classification: G10L11/00;G10L11/02;G10L15/00;G10L15/04;G10L15/10;G10L17/00;(IPC1-7):G10L17/00 Main classification: G10L11/00
Agency: Agent:
Principal claim:
Address: