Title of invention Speech processing device and speech processing method
Abstract A speech processing device is provided that can accurately extract a conversation group from among a plurality of speakers, even when a conversation group of three or more people is present. This device (400) comprises: a spontaneous speech detection unit (420) and a direction-specific speech detection unit (430) which separately detect, from a sound signal, uttered speech from the speakers; a conversation establishment level calculation unit (450) which calculates, on the basis of the detected uttered speech, a conversation establishment level for every pairing of two people and for each segment into which the determination time period is divided; an extended-period characteristic amount calculation unit (460) which calculates, for each pairing, an extended-period characteristic amount of the conversation establishment level over the determination time period; and a conversation-partner determination unit (470) which extracts a conversation group holding a conversation on the basis of the calculated extended-period characteristic amount.
Publication number US9064501(B2) Publication date 2015.06.23
Application number US201113816502 Filing date 2011.09.14
Applicant Panasonic Intellectual Property Management Co., Ltd. Inventors Yamada Maki;Endo Mitsuru
Classification codes G10L11/06;G10L15/20;G10L21/00;G10L25/48;G10L25/00;H04R25/00;G10L25/78;G10L25/06;G10L21/0208;G10L21/06 Main classification G10L11/06
Agency Wenderoth, Lind & Ponack, L.L.P. Agent Wenderoth, Lind & Ponack, L.L.P.
Main claim 1. A speech processing device, comprising: a speech detector that detects speech of individual speakers from acoustic signals; a total-amount-of-speech calculator that calculates, for each of all pairs of the speakers and for each of segments defined by dividing a determination time period, a total amount of speech on the basis of the detected speech, the total amount of speech being a sum of amounts of speech of the pair of speakers in the segment; an established-conversation calculator that calculates, for each of the pairs of the speakers and for each of the segments, a degree of established conversation on the basis of the detected speech, the degree of established conversation being a value indicating a rate of a time when one of the pair of the speakers gives speech and the other of the pair of the speakers gives no speech; a long-time feature calculator that calculates, for each of the pairs of the speakers, a long-time feature obtained by integrating the degrees of established conversation calculated for the pair of the speakers within the determination time period; and a conversational-partner determining unit that extracts a conversation group holding conversation from the speakers, on the basis of the calculated long-time features, wherein the established-conversation calculator excludes, for each of the pairs of the speakers, the degree of established conversation of the segment with the sum of amounts of speech lower than a first threshold from the calculation of the long-time feature for the pair of the speakers, and the conversational-partner determining unit determines that the speakers of the pair with the long-time feature greater than or equal to a second threshold belong to the same conversation group.
Address Osaka JP
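
The processing recited in the main claim can be illustrated with a minimal Python sketch. It is one possible reading of the claim language, not the patented implementation. The following are assumptions made for illustration only: speech detection is taken as already done and given as per-frame 0/1 flags per speaker; the degree of established conversation is taken as the fraction of frames in a segment in which exactly one of the two speakers is talking; "integrating" is taken as averaging over the retained segments; and pairs that pass the second threshold are merged into groups by a simple union-find. The names `long_time_features`, `conversation_groups`, `min_total_speech`, and `group_threshold` are hypothetical.

```python
from itertools import combinations


def long_time_features(activity, segment_len, min_total_speech):
    """Compute a long-time feature for every pair of speakers.

    activity: dict mapping speaker name -> list of 0/1 frame flags
              (all lists the same length); detection itself is assumed done.
    segment_len: number of frames per segment of the determination period.
    min_total_speech: the "first threshold" on the pair's summed speech frames.
    """
    n_frames = len(next(iter(activity.values())))
    features = {}
    for a, b in combinations(sorted(activity), 2):
        degrees = []
        for start in range(0, n_frames, segment_len):
            seg_a = activity[a][start:start + segment_len]
            seg_b = activity[b][start:start + segment_len]
            # Total amount of speech: sum of both speakers' speech frames.
            total_speech = sum(seg_a) + sum(seg_b)
            if total_speech < min_total_speech:
                continue  # segment excluded from the long-time feature
            # Assumed degree: share of frames where exactly one speaker talks.
            exclusive = sum(1 for x, y in zip(seg_a, seg_b) if x != y)
            degrees.append(exclusive / len(seg_a))
        # Assumed integration: mean over the retained segments.
        features[(a, b)] = sum(degrees) / len(degrees) if degrees else 0.0
    return features


def conversation_groups(features, group_threshold):
    """Merge pairs whose feature meets the "second threshold" into groups."""
    parent = {}

    def find(s):
        parent.setdefault(s, s)
        while parent[s] != s:
            parent[s] = parent[parent[s]]  # path compression
            s = parent[s]
        return s

    for (a, b), value in features.items():
        root_a, root_b = find(a), find(b)
        if value >= group_threshold:
            parent[root_a] = root_b
    groups = {}
    for s in list(parent):
        groups.setdefault(find(s), set()).add(s)
    # Only sets with two or more speakers count as conversation groups.
    return [g for g in groups.values() if len(g) > 1]


# Illustrative data: A and B take turns; C talks over both, then falls silent.
activity = {
    "A": [1, 1, 0, 0, 1, 1, 0, 0],
    "B": [0, 0, 1, 1, 0, 0, 1, 1],
    "C": [1, 1, 1, 1, 0, 0, 0, 0],
}
features = long_time_features(activity, segment_len=4, min_total_speech=3)
groups = conversation_groups(features, group_threshold=0.8)
print(features)
print(groups)
```

Because grouping is decided pair by pair from the long-time features, three or more speakers end up in the same group whenever their pairwise features chain together above the second threshold. Dropping low-speech segments keeps stretches in which a pair is mostly silent from distorting that pair's feature.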