摘要 |
Robust media fingerprints are derived from a portion of audio content. A portion of content in an audio signal is categorized. The audio content is characterized based, at least in part, on one or more of its features. The features may include a component that relates to one of several sound categories, e.g., speech and/or noise, which may be mixed with the audio signal. Upon categorizing the audio content as free of the speech or noise related components, the audio signal component is processed. Upon categorizing the audio content as including the speech related component and/or the noise related components, the speech or noise related components are separated from the audio signal. The audio signal is processed independent of the speech related component and/or the noise related component. Processing the audio signal includes computing the audio fingerprint, which reliably corresponds to the audio signal. |