Title of invention: Methods and apparatus for detecting a repetitive pattern in a sequence of audio frames
Abstract: Methods and apparatus for detecting a repetitive pattern in a sequence of audio frames are described. Similarity values of a first similarity matrix with a first resolution are calculated for the sequence. An adaptive threshold for classifying similarity values as repetition or non-repetition is estimated from these similarity values. For each of one or more offsets of a second similarity matrix with a second resolution higher than the first resolution, the similarity values of the second similarity matrix corresponding to that offset are calculated. The calculated similarity values are then binarized with the adaptive threshold to obtain binarized data. Finally, the repetitive pattern is detected from the binarized data. The memory requirement may be reduced because less data is stored while detecting the repetitive pattern.
Publication number: US9547715(B2)  Publication date: 2017.01.17
Application number: US201213564302  Filing date: 2012.08.01
Applicant: Dolby Laboratories Licensing Corporation  Inventors: Lu Lie; Cheng Bin
Classification: G06F17/30; G10L25/51; G10L25/03; G10L25/45  Main classification: G06F17/30
Agency:  Agent:
Main claim: 1. A method of detecting a repetitive pattern, the method comprising: receiving, by an apparatus including an embedded system having a processor and a memory, a sequence of audio frames; calculating, by the processor, similarity values of a first similarity matrix for the sequence, comprising: for each of a plurality of offsets of the first similarity matrix, calculating a significant score for evaluating the possibility of detecting the repetitive pattern corresponding to the offset based on the calculated similarity values of the first similarity matrix corresponding to the offset; comparing the significant score with a threshold associated with the offset; if the significant score is greater than the threshold, determining the offset as a significant offset; and storing the similarity values of the first similarity matrix corresponding to the significant offsets in a buffer; estimating, by the processor, an adaptive threshold from the similarity values for classifying the similarity values into repetition or non-repetition; for a second similarity matrix with the same resolution as the first similarity matrix, wherein the second similarity matrix includes more than one offset, reading the similarity values of the first similarity matrix corresponding to the significant offsets from the buffer as the similarity values of the second similarity matrix; classifying the read similarity values with the adaptive threshold to obtain binarized data; and detecting the repetitive pattern from the binarized data, wherein detecting the repetitive pattern detects a music chorus in the sequence of audio frames, wherein the processor calculates the significant score sig(l) according to an equation: $\mathrm{sig}(l) = \max_{t} \frac{1}{W} \sum_{i=1}^{W} s(t+i,\, l)$, wherein l is the offset and W is a window length, and wherein the processor calculates the threshold Th(l) associated with the offset according to an equation: $\mathrm{Th}(l) = \sum_{t=l-k_1}^{l+k_2} w(t)\, \mathrm{sig}(t)$, wherein l is the offset, w(t) is a weighting function set to 1/(k1+k2+1), sig(t) is the significant score, 0 ≤ k1 ≤ 1, 0 ≤ k2, and k1+k2 ≠ 0.
Address: San Francisco, CA, US
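
The offset-selection step recited in the main claim (compute sig(l) as the largest windowed mean of the similarity values at offset l, compute Th(l) as a uniformly weighted average of sig over the neighbouring offsets l-k1 to l+k2, and keep the offsets where sig(l) exceeds Th(l)) might be sketched as follows. This is only an illustrative sketch, not the patented implementation. The function names, the list-of-rows layout (`rows[l][t]` holding s(t, l)), the 0-based window indexing, the score of 0.0 for rows shorter than the window, and the clamping of neighbour offsets at the ends of the offset range are all assumptions that do not come from the record.

```python
def significance_score(row, window):
    """sig(l): largest mean over any `window` consecutive similarity values.

    `row` holds the similarity values s(t, l) for one offset l, indexed by t.
    """
    if len(row) < window:
        return 0.0  # assumption: not enough values to fill a single window
    running = sum(row[:window])
    best = running
    # Slide the window one step at a time, updating the running sum.
    for t in range(window, len(row)):
        running += row[t] - row[t - window]
        best = max(best, running)
    return best / window


def offset_threshold(sig, l, k1, k2):
    """Th(l): sum of w(t) * sig(t) for t in l-k1 .. l+k2, w(t) = 1/(k1+k2+1)."""
    weight = 1.0 / (k1 + k2 + 1)
    last = len(sig) - 1
    # Assumption: neighbours outside the offset range are clamped to the edge.
    return sum(weight * sig[min(max(t, 0), last)]
               for t in range(l - k1, l + k2 + 1))


def significant_offsets(rows, window, k1=1, k2=1):
    """Return the offsets l whose score sig(l) is greater than Th(l)."""
    sig = [significance_score(row, window) for row in rows]
    return [l for l in range(len(sig))
            if sig[l] > offset_threshold(sig, l, k1, k2)]
```

In this sketch only the rows returned by `significant_offsets` would need to be kept in the buffer for the later binarization step. Estimating the adaptive threshold used for that binarization is not covered here, because the record does not state how it is computed.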