Title of invention |
AUDIO PROCESSING APPARATUS AND AUDIO PROCESSING METHOD |
Abstract |
An audio processing apparatus includes a first-section detection unit configured to detect a first section that is a section in which the power of a spatial spectrum in a sound source direction is higher than a predetermined amount of power on the basis of an audio signal of a plurality of channels, a speech state determination unit configured to determine a speech state on the basis of an audio signal within the first section, a likelihood calculation unit configured to calculate a first likelihood that a type of sound source according to an audio signal within the first section is voice and a second likelihood that the type of sound source is non-voice, and a second-section detection unit configured to determine whether or not a second section in which power is higher than the average power of a speech section is a voice section on the basis of the first likelihood and the second likelihood within the second section. |
Application publication number |
US2017040030(A1) |
Application publication date |
2017.02.09 |
Application number |
US201615193481 |
Filing date |
2016.06.27 |
Applicant |
HONDA MOTOR CO., LTD. |
Inventors |
Nakamura Keisuke;Nakadai Kazuhiro |
Classification numbers |
G10L25/93;G10L25/78;G10L25/21 |
Main classification number |
G10L25/93 |
Agency |
|
Agent |
|
Main claim |
1. An audio processing apparatus comprising:
a first-section detection unit configured to detect a first section that is a section in which a power of a spatial spectrum in a sound source direction is higher than a predetermined amount of power on the basis of an audio signal of a plurality of channels;
a speech state determination unit configured to determine a speech state on the basis of an audio signal within the first section;
a likelihood calculation unit configured to calculate a first likelihood that a type of sound source according to an audio signal within the first section is voice and a second likelihood that the type of sound source is non-voice; and
a second-section detection unit configured to determine whether or not a second section in which power is higher than an average power of a speech section is a voice section on the basis of the first likelihood and the second likelihood within the second section. |
Address |
Tokyo JP |
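
The two-stage detection recited in claim 1 can be illustrated with a minimal Python sketch. It is not the applicant's implementation. The function names, the per-frame input lists, and the decision rule (comparing summed log-likelihoods) are all assumptions made for illustration; the claim only states that the decision is made "on the basis of" the two likelihoods. The sketch also skips the speech state determination unit and simply treats the whole first section as the speech section.

```python
from statistics import mean


def detect_first_sections(spatial_power, power_threshold):
    """Return (start, end) frame ranges, end exclusive, in which the
    spatial-spectrum power exceeds the predetermined threshold."""
    sections = []
    start = None
    for i, p in enumerate(spatial_power):
        if p > power_threshold and start is None:
            start = i                      # section opens
        elif p <= power_threshold and start is not None:
            sections.append((start, i))    # section closes
            start = None
    if start is not None:                  # section still open at the end
        sections.append((start, len(spatial_power)))
    return sections


def classify_second_section(frame_power, voice_ll, nonvoice_ll, section):
    """Pick the frames of `section` whose power exceeds the section's
    average power (the "second section") and decide voice / non-voice.

    Assumption: the whole first section stands in for the speech section,
    and the decision compares summed log-likelihoods.
    """
    start, end = section
    avg_power = mean(frame_power[start:end])
    second = [i for i in range(start, end) if frame_power[i] > avg_power]
    if not second:
        return second, False
    voice_score = sum(voice_ll[i] for i in second)
    nonvoice_score = sum(nonvoice_ll[i] for i in second)
    return second, voice_score > nonvoice_score


# Hypothetical per-frame values, chosen only to exercise the functions.
spatial_power = [1, 5, 6, 7, 5, 1, 1]
frame_power = [0, 2, 5, 6, 3, 0, 0]
voice_ll = [-5, -3, -1, -0.5, -1, -5, -5]      # log-likelihood of "voice"
nonvoice_ll = [-1, -2, -3, -4, -3, -1, -1]     # log-likelihood of "non-voice"

sections = detect_first_sections(spatial_power, power_threshold=3)
result = classify_second_section(frame_power, voice_ll, nonvoice_ll, sections[0])
# sections == [(1, 5)]; result == ([2, 3], True)
```

Restricting the likelihood comparison to the higher-power frames is what distinguishes the second-section step from a plain threshold detector: frames near the section edges, where power is low, do not contribute to the voice decision.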