摘要 |
Disclosed herein is an audio processing apparatus for processing a plurality of pieces of audio data of sounds picked up by a plurality of microphones. The apparatus includes: a speaker identification section configured to identify a speaker based on the audio data; a simultaneous speech section identification section configured to, when at least first and second speakers have been identified, identify speech sections during which the first and second speakers have made speeches, and identify a section during which the first and second speakers have made the speeches at the same time as a simultaneous speech section; and an arranging section configured to separate audio data of the first speaker and audio data of the second speaker from the simultaneous speech section, and allow the audio data of the first speaker and the audio data of the second speaker to be outputted at mutually different timings. |