Title of invention Derivation of probabilistic score for audio sequence alignment
Abstract A match score provides a semantically-meaningful quantification of the aural similarity of two chromae from two corresponding audio sequences. The match score can be applied to the chroma pairs of two corresponding audio sequences, and is independent of the lengths of the sequences, thereby permitting comparisons of matches across subsequences of different length. Accordingly, a single cutoff match score to identify “good” audio subsequence matches can be determined and has both good precision and good recall metrics. The function for computing the match score is derived by establishing a function PM indicating probabilities that chroma correspondence scores indicate semantic correspondences, establishing a function PR indicating probabilities that chroma correspondence scores indicate random correspondences, and repeatedly updating PM and the match function based on existing values of PM and the match function as applied to audio subsequences with known semantic correspondences.
Publication number US9384758(B2) Publication date 2016.07.05
Application number US201514754539 Filing date 2015.06.29
Applicant Google Inc. Inventor Anders Pedro Gonnet
Classification G10L19/12; G10L25/51 Main classification G10L19/12
Agency Fenwick & West LLP Agent Fenwick & West LLP
Main claim 1. A computer-implemented method for matching audio sequences, the method performed by a computer processor and comprising: deriving, by the computer processor, a first probability density function PM outputting a probability that an initial correspondence score for a pair of chroma vectors of an audio sequence indicates a semantic correspondence between the chroma vectors; deriving, by the computer processor, a second probability density function PR outputting a probability that the initial correspondence score for a pair of chroma vectors of an audio sequence indicates that the chroma vectors have a random correspondence, the deriving of PR comprising: randomly selecting a set of pairs of audio sequences; deriving initial correspondence scores for the set of pairs of audio sequences; and fitting the initial correspondence scores to a probability distribution; deriving, by the computer processor using PM and PR, a match function indicating whether a given pair of chroma vectors of an audio sequence correspond semantically; obtaining a first audio sequence; comparing, by the computer processor using the match function, the first audio sequence with a plurality of known audio sequences; and based on the comparing, identifying, by the computer processor, a best-matching audio sequence for the first audio sequence from the known audio sequences.
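The steps recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation, and nearly every concrete choice in it is an assumption made for illustration: cosine similarity as the initial correspondence score, Gaussian fits for both PM and PR, a log-ratio log(PM/PR) as the match function, frame-by-frame comparison of pre-aligned sequences of equal length, and synthetic random chroma data. PM is fitted here in a single pass from pairs with known correspondence; the repeated updating of PM described in the abstract is not shown. All function and variable names are hypothetical.

```python
import math
import random
import statistics

def cosine(u, v):
    """Assumed initial correspondence score: cosine similarity of two chroma vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def fit_gaussian(scores):
    """Fit scores to a Gaussian (an assumed distribution family); returns (mean, std)."""
    return statistics.fmean(scores), max(statistics.stdev(scores), 1e-6)

def log_pdf(x, mu, sigma):
    """Log of the Gaussian density, computed directly to avoid underflow."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2 * math.pi)

def make_match_function(pm_params, pr_params):
    """Assumed match function: log(PM(score) / PR(score))."""
    return lambda score: log_pdf(score, *pm_params) - log_pdf(score, *pr_params)

def sequence_score(match, seq_a, seq_b):
    """Mean match score over frame pairs (assumes the sequences are already aligned)."""
    pairs = list(zip(seq_a, seq_b))
    return sum(match(cosine(u, v)) for u, v in pairs) / len(pairs)

def best_match(match, query, known):
    """Name of the known sequence with the highest mean match score."""
    return max(known, key=lambda name: sequence_score(match, query, known[name]))

# --- Synthetic demo data (hypothetical; real chroma vectors come from audio) ---
rng = random.Random(0)

def random_sequence(frames=40):
    return [[rng.random() for _ in range(12)] for _ in range(frames)]

def noisy_copy(seq, noise=0.05):
    return [[max(0.0, x + rng.gauss(0, noise)) for x in frame] for frame in seq]

known = {name: random_sequence() for name in ("song_a", "song_b", "song_c", "song_d")}
names = list(known)

# PR: scores of randomly paired chroma vectors, fitted to a distribution.
random_scores = []
for _ in range(500):
    a, b = rng.sample(names, 2)
    random_scores.append(cosine(rng.choice(known[a]), rng.choice(known[b])))
pr_params = fit_gaussian(random_scores)

# PM: scores of pairs with known semantic correspondence (noisy copies here).
matched_scores = [cosine(u, v)
                  for seq in known.values()
                  for u, v in zip(seq, noisy_copy(seq))]
pm_params = fit_gaussian(matched_scores)

match = make_match_function(pm_params, pr_params)
query = noisy_copy(known["song_b"])
result = best_match(match, query, known)
print(result)
```

Averaging the per-pair score, rather than summing it, is one simple way to keep the result comparable across sequences of different length; whether the patent normalizes in this way is not stated in the text above.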
Address Mountain View CA US