Abstract |
PROBLEM TO BE SOLVED: To provide a voice detection device, a voice detection method, and a program that detect voice with high accuracy in conversational speech composed of a plurality of incoming signals. SOLUTION: A voice detection device 100 includes: a plurality of voice score calculation parts 102 that calculate, from a plurality of incoming signals, a plurality of time-series voice scores each representing voice likeness; and a voice section determination part 106 that determines the voice sections of the plurality of incoming signals by using the voice scores and a conversation model 104, which outputs a conversation score representing the speech likeness of the incoming signals. The conversation score represents the speech likeness of each of a plurality of state transition series, each of which assumes a time-series transition of utterance states in a conversation among a plurality of speakers. The voice section determination part 106 provisionally obtains a plurality of voice sections on the basis of each voice score, assumes a plurality of state transition series from the provisional voice sections of the plurality of incoming signals, selects from the assumed series the one exhibiting the highest speech likeness by using the conversation model 104, and determines the voice sections of the plurality of incoming signals on the basis of the selected state transition series. |
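
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation. The function names, the 0.5 threshold, the keep-or-drop way of generating candidate state transition series, the penalty-based toy conversation model, and the log-likelihood term added from the voice scores are all assumptions made for illustration; the abstract does not specify any of them.

```python
import math
from itertools import product

THRESHOLD = 0.5  # assumed threshold for the provisional decision


def provisional_sections(scores, threshold=THRESHOLD):
    """Return [start, end) frame ranges where the voice score exceeds the threshold."""
    sections, start = [], None
    for t, s in enumerate(scores):
        if s > threshold and start is None:
            start = t
        elif s <= threshold and start is not None:
            sections.append((start, t))
            start = None
    if start is not None:
        sections.append((start, len(scores)))
    return sections


def candidate_series(sections_per_channel, n_frames):
    """Yield one joint state series per keep/drop choice of provisional sections (assumed scheme)."""
    flat = [(ch, sec) for ch, secs in enumerate(sections_per_channel) for sec in secs]
    n_ch = len(sections_per_channel)
    for keep in product([True, False], repeat=len(flat)):
        active = [[False] * n_frames for _ in range(n_ch)]
        for kept, (ch, (a, b)) in zip(keep, flat):
            if kept:
                for t in range(a, b):
                    active[ch][t] = True
        # State at frame t: tuple of "is speaking" flags, one per channel.
        yield [tuple(active[ch][t] for ch in range(n_ch)) for t in range(n_frames)]


def conversation_score(states, overlap_penalty=2.0, switch_penalty=1.0):
    """Toy conversation model (assumed): penalize overlapping speech and state changes."""
    score = 0.0
    for t, st in enumerate(states):
        if sum(st) > 1:
            score -= overlap_penalty
        if t > 0 and st != states[t - 1]:
            score -= switch_penalty
    return score


def acoustic_score(states, scores_per_channel, eps=1e-6):
    """Log-likelihood of a state series, treating each voice score as a probability (assumed)."""
    total = 0.0
    for t, st in enumerate(states):
        for ch, on in enumerate(st):
            p = min(max(scores_per_channel[ch][t], eps), 1 - eps)
            total += math.log(p if on else 1 - p)
    return total


def detect(scores_per_channel):
    """Pick the best-scoring candidate series and return its voice sections per channel."""
    n_frames = len(scores_per_channel[0])
    secs = [provisional_sections(s) for s in scores_per_channel]
    best = max(
        candidate_series(secs, n_frames),
        key=lambda st: conversation_score(st) + acoustic_score(st, scores_per_channel),
    )
    return [
        provisional_sections([1.0 if st[ch] else 0.0 for st in best])
        for ch in range(len(scores_per_channel))
    ]


# Channel B picks up weak crosstalk (score 0.6) while speaker A is talking.
scores_a = [0.9, 0.9, 0.9, 0.9, 0.1, 0.1, 0.1, 0.1]
scores_b = [0.1, 0.6, 0.6, 0.1, 0.9, 0.9, 0.9, 0.9]
result = detect([scores_a, scores_b])
```

In this made-up example, thresholding alone gives channel B a provisional section at frames 1 to 2. Under the assumed penalties, the candidate that drops that section scores higher because it avoids overlapping speech and extra state changes. Enumerating every keep/drop combination grows exponentially with the number of provisional sections, so the sketch is only practical for short inputs.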