Abstract |
A sound signal processing device capable of suitably extracting a main sound from a mixed sound in which unnecessary sound (for example, leakage sound or reverberant sound) is mixed with the main sound. More specifically, a time-domain mixed sound signal including a first sound and a second sound, and a time-domain target sound signal including sound corresponding to at least the second sound, the two signals being temporally related in whole or in part, are each divided into a plurality of frequency bands. A level ratio between the two signals is calculated for each frequency band. Based on the level ratio, the signal of the first sound included in the mixed sound signal is extracted. |
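
The band-wise level-ratio idea described above can be illustrated with a minimal sketch. It is not the claimed implementation: the abstract does not state how the signals are divided into bands or how the ratio drives extraction, so the non-overlapping rectangular frames, the naive DFT, the hard threshold, and every name here (`extract_first_sound`, `frame_len`, `ratio_threshold`, `eps`) are assumptions made for illustration only.

```python
import cmath
import math


def dft(frame):
    """Naive discrete Fourier transform of one frame (O(n^2), for illustration)."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(); returns the real part of each time-domain sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def extract_first_sound(mixed, target, frame_len=32, ratio_threshold=2.0, eps=1e-9):
    """Keep the frequency bins in which the mixed signal is clearly louder than the target.

    Assumption: a bin is attributed to the first sound when the level ratio
    |mixed| / |target| exceeds ratio_threshold; all other bins are zeroed.
    Trailing samples that do not fill a whole frame are dropped.
    """
    if len(mixed) != len(target):
        raise ValueError("mixed and target must have the same length")
    out = []
    for start in range(0, len(mixed) - frame_len + 1, frame_len):
        mixed_spec = dft(mixed[start:start + frame_len])
        target_spec = dft(target[start:start + frame_len])
        kept = []
        for m_bin, t_bin in zip(mixed_spec, target_spec):
            # eps avoids division by zero where the target is silent.
            level_ratio = abs(m_bin) / (abs(t_bin) + eps)
            kept.append(m_bin if level_ratio > ratio_threshold else 0j)
        out.extend(idft(kept))
    return out


# Toy signals: the first sound sits in bin 4, the second sound in bin 10.
n = 64
first = [math.sin(2 * math.pi * 4 * t / 32) for t in range(n)]
second = [0.5 * math.sin(2 * math.pi * 10 * t / 32) for t in range(n)]
mixed = [a + b for a, b in zip(first, second)]
recovered = extract_first_sound(mixed, second)
```

In this toy case the target signal equals the second sound, so the bins it occupies have a level ratio near 1 and are discarded, while the bins of the first sound survive. A practical system would more likely use overlapping windowed frames and a soft gain instead of a hard threshold; those choices are outside what the abstract specifies.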