Title of invention Method and system for semantically segmenting an audio sequence
Abstract An audio segmentation method and system are described that automatically segment an audio sequence into audio scenes of similar semantic content. The audio sequence is first split into segments of arbitrary length (step 101). Each segment is then subjected to short-term spectral analysis (step 102) to generate feature vectors characterising the audio. A vector quantisation (VQ) technique is used to generate a signature codebook from the feature vectors of the audio segments (step 103). An Earth Mover's Distance (EMD) measure is then used to calculate distances between consecutive audio segments (step 104). By statistically analysing the EMD measures to identify peaks, changes in the dominant audio content, which indicate audio scene changes, can be detected (step 105). In this way, the time-consuming and laborious process of organising and indexing increasingly large audio databases can be automated, so that they can be easily browsed and searched using natural query structures.
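Steps 101 to 105 of the abstract can be illustrated with a minimal sketch. It is not the claimed implementation and rests on several simplifying assumptions of its own: each frame is reduced to a single scalar feature (a spectral centroid computed with a naive DFT) instead of a full feature vector, the codebook is built per segment with a small 1-D k-means, the EMD is computed with the closed form that holds only in one dimension, and peaks are flagged with a simple mean-plus-k-standard-deviations rule. All function names and parameters (segment_audio, seg_len, frame_len, n_codes, k) are hypothetical.

```python
import cmath
import math
import statistics


def split_segments(samples, seg_len):
    """Step 101: cut the sequence into fixed-length segments (tail dropped)."""
    return [samples[i:i + seg_len]
            for i in range(0, len(samples) - seg_len + 1, seg_len)]


def spectral_centroid(frame):
    """Step 102 (simplified): one scalar spectral feature per frame, in DFT bins."""
    n = len(frame)
    mags = [abs(sum(x * cmath.exp(-2j * math.pi * b * t / n)
                    for t, x in enumerate(frame)))
            for b in range(1, n // 2)]
    total = sum(mags)
    return sum(b * m for b, m in enumerate(mags, start=1)) / total if total else 0.0


def signature(features, n_codes=4, iters=10):
    """Step 103 (simplified): 1-D k-means; returns [(codeword, weight), ...]."""
    lo, hi = min(features), max(features)
    codes = [lo + (hi - lo) * (i + 0.5) / n_codes for i in range(n_codes)]
    for _ in range(iters):
        buckets = [[] for _ in codes]
        for f in features:
            nearest = min(range(n_codes), key=lambda i: abs(f - codes[i]))
            buckets[nearest].append(f)
        codes = [sum(b) / len(b) if b else c for b, c in zip(buckets, codes)]
    # Empty clusters are dropped; the remaining weights sum to 1.
    return [(c, len(b) / len(features)) for c, b in zip(codes, buckets) if b]


def emd_1d(sig_a, sig_b):
    """Step 104: EMD between two 1-D signatures of equal total weight.

    In one dimension, with ground distance |x - y|, the EMD equals the area
    between the two cumulative distributions.
    """
    events = sorted([(x, w) for x, w in sig_a] + [(x, -w) for x, w in sig_b])
    dist, cdf_diff, prev = 0.0, 0.0, events[0][0]
    for x, w in events:
        dist += abs(cdf_diff) * (x - prev)
        cdf_diff += w
        prev = x
    return dist


def scene_changes(distances, k=1.0):
    """Step 105 (simplified): local peaks above mean + k * standard deviation."""
    thresh = statistics.mean(distances) + k * statistics.pstdev(distances)
    last = len(distances) - 1
    # Distance i compares segments i and i + 1, so a peak marks segment i + 1.
    return [i + 1 for i, d in enumerate(distances)
            if d > thresh
            and (i == 0 or d >= distances[i - 1])
            and (i == last or d >= distances[i + 1])]


def segment_audio(samples, seg_len=512, frame_len=32, k=1.0):
    """Run steps 101-105; returns (indices of segments starting a scene, distances)."""
    sigs = []
    for seg in split_segments(samples, seg_len):
        feats = [spectral_centroid(seg[i:i + frame_len])
                 for i in range(0, seg_len - frame_len + 1, frame_len)]
        sigs.append(signature(feats))
    dists = [emd_1d(a, b) for a, b in zip(sigs, sigs[1:])]
    return scene_changes(dists, k), dists
```

Calling segment_audio on a list of samples returns the indices of the segments at which a new scene is judged to begin, together with the raw EMD values between consecutive segments. With multi-dimensional feature vectors the closed form in emd_1d no longer applies, and the EMD would have to be solved as a transportation problem.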
Publication number GB0406500(D0) Publication date 2004.04.28
Application number GB20040006500 Filing date 2004.03.23
Applicant BRITISH TELECOMMUNICATIONS PUBLIC LIMITED COMPANY Inventor
Classification G10L11/00 Main classification G10L11/00
Agency Agent
Principal claim
Address