Title: Automatic labeling and control of audio algorithms by audio recognition
Abstract: Controlling a multimedia software application using high-level metadata features and symbolic object labels derived from an audio source is disclosed, wherein a first pass of low-level signal analysis is performed, followed by a stage of statistical and perceptual processing, followed by a symbolic machine-learning or data-mining processing component. This multi-stage analysis system delivers high-level metadata features, sound object identifiers, stream labels, or other symbolic metadata to the application scripts or programs, which use the data to configure processing chains or map it to other media. Embodiments of the invention can be incorporated into multimedia content players, musical instruments, recording studio equipment, installed and live sound equipment, broadcast equipment, metadata-generation applications, software-as-a-service applications, search engines, and mobile devices.
Publication number: US9031243(B2) Publication date: 2015.05.12
Application number: US201012892843 Filing date: 2010.09.28
Applicant: iZotope, Inc. Inventors: LeBoeuf Jay; Pope Stephen
Classification: H04R29/00 Main classification: H04R29/00
Agency: Agent: Lowry David
Principal claim: 1. A non-transitory computer-readable storage medium having embodied thereon a program, the program being executable by a processor to perform a method for multi-stage audio signal analysis, the method comprising: performing a first-stage processing operation on an audio signal, the first stage processing operation including a windowed signal analysis to calculate from the audio signal statistical descriptor features that are stored in a raw feature vector; performing a second stage statistical processing operation on the raw feature vector to derive a reduced feature vector; performing a third stage processing operation on the reduced feature vector to derive at least one sound object label that refers to the original audio signal; and mapping the at least one sound object label into a stream of control events sent to a sound-object-driven, multimedia-aware software application, wherein the sound-object-driven multimedia-aware software application is responsive to the stream of control events to configure processing for the audio signal, and wherein any of the processing operations of the first through third stages are configurable.
Address: Cambridge MA US
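
The three stages and the label-to-event mapping recited in the principal claim can be illustrated with a minimal sketch. This is not the patented implementation. The feature choices (RMS level and zero-crossing rate), the threshold rules standing in for a machine-learning stage, the label names, and the `CONTROL_MAP` table are all assumptions made for illustration, as is every function name.

```python
import math
import random
from statistics import mean, pstdev


def stage1_raw_features(signal, window=256, hop=128):
    """Stage 1 (assumed features): one (RMS, zero-crossing rate) pair per window."""
    raw = []
    for start in range(0, len(signal) - window + 1, hop):
        frame = signal[start:start + window]
        rms = math.sqrt(sum(x * x for x in frame) / window)
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
        raw.append((rms, crossings / (window - 1)))
    if not raw:
        raise ValueError("signal is shorter than one analysis window")
    return raw


def stage2_reduce(raw):
    """Stage 2: summarize the raw per-window features as mean and spread."""
    rms_vals = [r for r, _ in raw]
    zcr_vals = [z for _, z in raw]
    return {
        "rms_mean": mean(rms_vals), "rms_std": pstdev(rms_vals),
        "zcr_mean": mean(zcr_vals), "zcr_std": pstdev(zcr_vals),
    }


def stage3_label(reduced, silence_rms=0.01, noisy_zcr=0.3):
    """Stage 3: toy threshold rules standing in for a trained classifier."""
    if reduced["rms_mean"] < silence_rms:
        return "silence"
    return "noisy" if reduced["zcr_mean"] > noisy_zcr else "tonal"


# Hypothetical mapping from sound object labels to application settings.
CONTROL_MAP = {
    "silence": [("gate", "closed")],
    "tonal": [("eq", "harmonic_enhance"), ("compressor", "gentle")],
    "noisy": [("denoiser", "on"), ("eq", "high_cut")],
}


def map_to_events(label):
    """Turn one label into a list of control events for a host application."""
    return [{"label": label, "target": t, "setting": s} for t, s in CONTROL_MAP[label]]


def analyze(signal, window=256, hop=128, silence_rms=0.01, noisy_zcr=0.3):
    """Run all stages; each stage's parameters are exposed for configuration."""
    raw = stage1_raw_features(signal, window, hop)
    reduced = stage2_reduce(raw)
    label = stage3_label(reduced, silence_rms, noisy_zcr)
    return label, map_to_events(label)


if __name__ == "__main__":
    sample_rate = 8000  # assumed sample rate for the demo signals
    tone = [0.5 * math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(2048)]
    random.seed(0)
    noise = [random.uniform(-0.5, 0.5) for _ in range(2048)]
    for name, sig in (("tone", tone), ("noise", noise), ("zeros", [0.0] * 2048)):
        print(name, analyze(sig))
```

Keeping each stage as a separate function with its own parameters mirrors the claim's requirement that any of the first through third stages be configurable. A real system would presumably replace the threshold rules with a trained model and deliver the events to the host application over whatever control interface it exposes.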