Title: Speaker state detecting apparatus and speaker state detecting method
Abstract: A speaker state detecting apparatus comprises: an audio input unit for acquiring, at least, a first voice emanated by a first speaker and a second voice emanated by a second speaker; a speech interval detecting unit for detecting an overlap period between a first speech period of the first speaker included in the first voice and a second speech period of the second speaker included in the second voice, which starts before the first speech period, or an interval between the first speech period and the second speech period; a state information extracting unit for extracting state information representing a state of the first speaker from the first speech period; and a state detecting unit for detecting the state of the first speaker in the first speech period based on the overlap period or the interval and the first state information.
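The overlap-or-interval step in the abstract can be illustrated with a minimal Python sketch. It assumes each speech period is already available as a start and end time in seconds; the names `SpeechPeriod` and `overlap_or_interval` are hypothetical and do not come from the patent, which does not specify how the periods are represented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeechPeriod:
    """One continuous stretch of speech (hypothetical representation)."""
    speaker: str
    start: float  # seconds from the start of the recording
    end: float    # seconds from the start of the recording


def overlap_or_interval(preceding: SpeechPeriod, following: SpeechPeriod):
    """Classify the timing relation between two temporally continuous periods.

    `preceding` is the period that starts first (the second speaker's period
    in the abstract); `following` is the period under analysis (the first
    speaker's period). Returns ("overlap", seconds) when the following period
    starts before the preceding one ends, otherwise ("interval", seconds).
    """
    if preceding.start > following.start:
        raise ValueError("preceding period must start first")
    if following.start < preceding.end:
        # The two speakers talk at the same time for this long.
        overlap_end = min(preceding.end, following.end)
        return ("overlap", overlap_end - following.start)
    # Silence between the end of one period and the start of the next.
    return ("interval", following.start - preceding.end)
```

For example, a period from 0.0 s to 5.0 s followed by one from 4.0 s to 8.0 s yields a 1.0 s overlap, while a following period that starts at 6.5 s yields a 1.5 s interval.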
Publication number: US9002704(B2) Publication date: 2015.04.07
Application number: US201213365662 Filing date: 2012.02.03
Applicant: Fujitsu Limited Inventor: Kamano Akira
Classification: G10L15/00;G10L17/00;G10L25/63;G10L17/26 Main classification: G10L15/00
Agency: Fujitsu Patent Center Agent: Fujitsu Patent Center
Main claim: 1. A speaker state detecting apparatus comprising: an audio input unit which acquires, at least, a first voice emanated by a first speaker and a second voice emanated by a second speaker; a storage unit which stores a state affection model with respect to a set of an overlap period or interval between two speech periods that are temporally continuous and a state of a speaker who has emanated a voice in a preceding speech period of the two speech periods, the model including probabilities of respective possible states which a speaker who has emanated a voice in a later speech period of the two speech periods can have; and a processor adapted to detect an overlap period between a first speech period of the first speaker included in the first voice and a second speech period of the second speaker included in the second voice, which starts before the first speech period, or an interval between the first speech period and the second speech period; extract first state information representing a state of the first speaker from the first speech period and second state information representing a state of the second speaker from the second speech period; and detect the state of the first speaker in the first speech period based on the overlap period or the interval and the first and second state information, the detecting the state of the first speaker comprising: detecting a state of the second speaker in the second speech period based on the second state information; detecting a state of the first speaker in the first speech period based on the first state information; deriving a degree of accuracy representing a likelihood of the state of the first speaker; and determining the detected state of the first speaker to be a state of the first speaker in the first speech period when the degree of accuracy is higher than a redetermination threshold value, and when the degree of accuracy is equal to or lower than the redetermination threshold value, obtaining probabilities of the possible states which the first speaker can have, corresponding to a set of the overlap period or the interval and the state of the second speaker in the second speech period in accordance with the state affection model and determining a state for which the probability is the maximum, of the possible states which the first speaker can have, to be the state of the first speaker in the first speech period.
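The redetermination logic at the end of claim 1 can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: the overlap period or interval is assumed to be discretized into a pattern label beforehand, the state affection model is assumed to be a plain dictionary, and the names `StateAffectionModel` and `decide_state` are hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical layout of the state affection model:
# (overlap/interval pattern label, state of the preceding speaker)
#   -> probability of each possible state of the later speaker.
StateAffectionModel = Dict[Tuple[str, str], Dict[str, float]]


def decide_state(
    detected_state: str,
    accuracy: float,
    redetermination_threshold: float,
    pattern: str,
    preceding_state: str,
    model: StateAffectionModel,
) -> str:
    """Pick the final state of the first speaker for one speech period.

    When the degree of accuracy is higher than the threshold, the state
    detected from the speaker's own voice is kept. Otherwise the state with
    the maximum probability is looked up in the model, keyed by the
    overlap/interval pattern and the preceding speaker's state.
    """
    if accuracy > redetermination_threshold:
        return detected_state
    probabilities = model[(pattern, preceding_state)]
    # Return the candidate state with the highest probability.
    return max(probabilities, key=probabilities.get)
```

With an illustrative model entry `{("long_overlap", "anger"): {"anger": 0.5, "apology": 0.3, "calm": 0.2}}` and a threshold of 0.6, a detection of "calm" at accuracy 0.9 is kept, while the same detection at accuracy 0.4 is replaced by "anger". The state names, pattern label, and probabilities here are invented for the example.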
Address: Kawasaki JP