Title: Sound signal processing apparatus, sound signal processing method, and program
Abstract: A sound signal processing apparatus includes an observed signal analysis unit that receives as an observed signal a sound signal for channels obtained by a sound signal input unit formed of microphones and estimates a sound direction and a sound segment of a target sound which is sound to be extracted and a sound source extraction unit that receives the sound direction and sound segment of the target sound estimated by the observed signal analysis unit and extracts the sound signal for the target sound. The observed signal analysis unit includes a short time Fourier transform unit that generates an observed signal in time-frequency domain by applying short time Fourier transform to the sound signal for the channels received and a direction/segment estimation unit that receives the observed signal generated by the short time Fourier transform unit and detects the sound direction and sound segment of the target sound.
Publication number: US9357298(B2) Publication date: 2016.05.31
Application number: US201414221598 Filing date: 2014.03.21
Applicant: SONY CORPORATION Inventor: Hiroe Atsuo
Classification: H04R29/00;H04R3/00;G10L21/0272;H04R27/00 Main classification: H04R29/00
Agency: Oblon, McClelland, Maier & Neustadt, L.L.P. Agent: Oblon, McClelland, Maier & Neustadt, L.L.P.
Principal claim: 1. A sound signal processing apparatus comprising: an observed signal analysis circuit configured to receive as an observed signal a sound signal for a plurality of channels obtained by a sound signal input unit formed of a plurality of microphones placed at different positions and estimate a sound direction and a sound segment of a target sound which is sound to be extracted; and a sound source extraction circuit configured to receive the sound direction and sound segment of the target sound estimated by the observed signal analysis circuit and extract the sound signal for the target sound, wherein the observed signal analysis circuit includes: a short time Fourier transform circuit configured to generate an observed signal in time-frequency domain by applying short time Fourier transform to the sound signal for the plurality of channels received; and a direction/segment estimation circuit configured to receive the observed signal generated by the short time Fourier transform circuit and detect the sound direction and sound segment of the target sound, and wherein the sound source extraction circuit is configured to: execute iterative learning in which an extracting filter U′ is iteratively updated using a result of application of the extracting filter to the observed signal, prepare, as a function to be applied in the iterative learning, an objective function G(U′) that assumes a local minimum or a local maximum when a value of the extracting filter U′ is a value optimal for extraction of the target sound, and compute a value of the extracting filter U′ which is in a neighborhood of a local minimum or a local maximum of the objective function G(U′) using an auxiliary function method during the iterative learning, and apply the computed extracting filter to extract the sound signal for the target sound.
Address: Tokyo JP
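
Illustrative sketch (not part of the patent record): the abstract and the principal claim describe two steps that can be sketched roughly in code, namely a short time Fourier transform of the multichannel observed signal and an iterative update of an extracting filter by an auxiliary function method. The record does not state the form of the objective function G(U′), so the example below assumes a simple sparsity cost, G(u) = Σ_t |uᴴ x_t| under a unit-variance constraint, for a single frequency bin. The function names `stft_multichannel` and `learn_extracting_filter`, the parameters `frame_len`, `hop`, `n_iter`, and `eps`, and the choice of cost are all assumptions made for illustration. The estimated sound direction and sound segment that the claim feeds into the extraction step are not modeled here.

```python
import numpy as np

def stft_multichannel(x, frame_len=512, hop=128):
    """x: (channels, samples) real array -> (channels, freq_bins, frames) complex array."""
    n_ch, n_samples = x.shape
    window = np.hanning(frame_len)
    n_frames = 1 + (n_samples - frame_len) // hop
    out = np.empty((n_ch, frame_len // 2 + 1, n_frames), dtype=complex)
    for t in range(n_frames):
        seg = x[:, t * hop : t * hop + frame_len] * window
        out[:, :, t] = np.fft.rfft(seg, axis=1)
    return out

def learn_extracting_filter(X, n_iter=20, eps=1e-12):
    """Auxiliary-function iteration for one frequency bin (assumed cost, see lead-in).

    X: (channels, frames) complex observations.
    Returns (w, history): w is the filter with output y = w^H X, and history
    holds the assumed cost G evaluated before each update.
    """
    n_ch, n_frames = X.shape
    # Whiten so that the unit-variance constraint becomes ||u|| = 1.
    R = X @ X.conj().T / n_frames
    d, E = np.linalg.eigh(R)
    P = (E / np.sqrt(d)).conj().T            # P = D^{-1/2} E^H
    Z = P @ X
    u = np.zeros(n_ch, dtype=complex)
    u[0] = 1.0
    history = []
    for _ in range(n_iter):
        y = u.conj() @ Z                     # apply the current filter
        history.append(float(np.abs(y).sum()))
        # Auxiliary variables: |y| <= |y|^2 / (2 b) + b / 2, equality at b = |y|.
        b = np.maximum(np.abs(y), eps)
        # The auxiliary function is quadratic in u with this weighted covariance.
        V = (Z / (2.0 * b)) @ Z.conj().T
        # Its minimizer over unit-norm u is the eigenvector of the smallest eigenvalue.
        _, vecs = np.linalg.eigh(V)
        u = vecs[:, 0]
    return P.conj().T @ u, history
```

In this sketch each update minimizes an upper bound that touches the assumed cost at the current filter, so the recorded cost does not increase from one iteration to the next. That monotone behavior is the general property that motivates auxiliary function methods; the objective function actually used in the patent may differ.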