Title of invention: Segmentation of audio data for indexing of conversational speech for real-time or postprocessing applications
Abstract: A method for segmenting audio data, comprising speech from a plurality of individual speakers, according to speaker is provided. The method comprises providing individual HMMs for each individual speaker, each individual HMM including at least one state, and constructing a speaker network HMM by connecting the individual HMMs in parallel. The audio data is then divided into segments by determining a most likely sequence of states through the speaker network HMM, each of the segments being associated with one of the individual HMMs. Afterward, the speaker of each of the segments is identified. The segmented data may be used to form an index into the audio data according to speaker.
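The segmentation step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: it assumes each speaker HMM has a single state with a one-dimensional Gaussian emission, that the parallel connection is modeled by a fixed probability of staying with the current speaker (`stay_prob`), and that frames are scalar features. The names `segment_by_speaker`, `gaussian_logpdf`, and `stay_prob` are hypothetical and chosen only for illustration. The most likely state sequence is found with the standard Viterbi algorithm, and consecutive frames assigned to the same speaker model are merged into segments.

```python
import math


def gaussian_logpdf(x, mean, var):
    """Log density of a 1-D Gaussian (assumed emission model for this sketch)."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def segment_by_speaker(frames, speakers, stay_prob=0.95):
    """Viterbi decoding over single-state speaker models connected in parallel.

    frames:   list of scalar features (hypothetical stand-in for acoustic features)
    speakers: dict mapping speaker name -> (mean, variance)
    stay_prob: assumed probability (0 < stay_prob < 1) of remaining with the
               same speaker from one frame to the next
    Returns a list of (start, end_exclusive, speaker_name) segments.
    """
    if not frames:
        return []
    names = list(speakers)
    n = len(names)
    log_stay = math.log(stay_prob)
    # The remaining probability mass is shared equally among the other speakers.
    log_switch = math.log((1.0 - stay_prob) / (n - 1)) if n > 1 else float("-inf")

    # Initialization: uniform prior over speakers plus the first emission.
    score = [-math.log(n) + gaussian_logpdf(frames[0], *speakers[name])
             for name in names]
    backptrs = []

    # Recursion: keep the best predecessor for every speaker at every frame.
    for x in frames[1:]:
        new_score, ptr = [], []
        for j, name in enumerate(names):
            candidates = [score[i] + (log_stay if i == j else log_switch)
                          for i in range(n)]
            best_i = max(range(n), key=lambda i: candidates[i])
            new_score.append(candidates[best_i] + gaussian_logpdf(x, *speakers[name]))
            ptr.append(best_i)
        score = new_score
        backptrs.append(ptr)

    # Backtrace from the best final state to recover the state sequence.
    state = max(range(n), key=lambda i: score[i])
    path = [state]
    for ptr in reversed(backptrs):
        state = ptr[state]
        path.append(state)
    path.reverse()

    # Merge runs of identical states into speaker-labeled segments.
    segments, start = [], 0
    for t in range(1, len(path) + 1):
        if t == len(path) or path[t] != path[start]:
            segments.append((start, t, names[path[start]]))
            start = t
    return segments
```

Calling `segment_by_speaker(frames, {"A": (0.0, 1.0), "B": (5.0, 1.0)})` on a frame list returns segments that could serve as a speaker index into the audio. In this sketch the segment label is simply the name of the matching model, whereas the abstract treats identification of each segment's speaker as a separate later step.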
Publication number: US5655058(A) Publication date: 1997.08.05
Application number: US19940226519 Filing date: 1994.04.12
Applicant: XEROX CORPORATION Inventors: BALASUBRAMANIAN, VIJAY;CHEN, FRANCINE R.;CHOU, PHILIP A.;KIMBER, DONALD G.;POON, ALEX D.;WEBER, KARON A.;WILCOX, LYNN D.
Classification: G10L15/04;G10L15/10;G10L15/14;G10L17/00;H04R3/00;(IPC1-7):G10L5/06;G10L9/00 Main classification: G10L15/04
Agency: Agent:
Main claim:
Address: