Title of invention NON-SPEECH SECTION DETECTING METHOD AND NON-SPEECH SECTION DETECTING DEVICE
Abstract <p>A frame generating section (20) of control means (2) generates frames of a predetermined time length from sound data. For each frame, a spectrum bias/power/pitch deriving unit (21a) derives at least one of the following: the bias of the spectrum obtained by converting the sound data into components on the frequency axis, the power of the sound data, and the pitch of the sound data. A variation amount deriving unit (21b) derives the amount by which the value derived by the spectrum bias/power/pitch deriving unit (21a) varies from that of the previous frame. The ratio of the first-order autocorrelation function of the sound data to its zero-order autocorrelation function is used as the bias of the spectrum. When the amount of variation is judged to be at or below a predetermined threshold over consecutive frames, and the number of those consecutive frames is at or above a predetermined number, a non-speech section detecting unit (22b) detects the run of consecutive frames as a non-speech section. An isolated section in which the amount of variation is large is excluded from the non-speech section. However, if such an isolated section is sandwiched between two non-speech sections, it is detected as a non-speech section irrespective of the judgment.</p>
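The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: it uses only the spectrum bias (the ratio of the first-order to the zero-order autocorrelation) as the per-frame value, and every function name and parameter here (`spectral_bias`, `threshold`, `min_frames`, `max_gap`) is a hypothetical choice made for illustration. In particular, the abstract does not say how long a sandwiched section may be, so `max_gap` is an assumption.

```python
def make_frames(samples, frame_len):
    """Split the samples into non-overlapping frames of frame_len samples."""
    last_start = len(samples) - frame_len
    return [samples[i:i + frame_len] for i in range(0, last_start + 1, frame_len)]


def spectral_bias(frame):
    """Ratio of first-order to zero-order autocorrelation of one frame."""
    r0 = sum(x * x for x in frame)
    if r0 == 0:
        return 0.0  # assumption: treat an all-zero frame as having zero bias
    r1 = sum(a * b for a, b in zip(frame, frame[1:]))
    return r1 / r0


def low_variation_runs(values, threshold, min_frames):
    """Return (first, last) frame indices of runs whose frame-to-frame
    variation stays at or below threshold for at least min_frames frames."""
    runs = []
    start = None
    for i in range(1, len(values)):
        low = abs(values[i] - values[i - 1]) <= threshold
        if low and start is None:
            start = i
        elif not low and start is not None:
            if i - start >= min_frames:
                runs.append((start, i - 1))
            start = None
    if start is not None and len(values) - start >= min_frames:
        runs.append((start, len(values) - 1))
    return runs


def merge_sandwiched(runs, max_gap):
    """Absorb a high-variation gap of at most max_gap frames (assumed limit)
    that lies between two detected non-speech runs."""
    merged = []
    for start, end in runs:
        if merged and start - merged[-1][1] - 1 <= max_gap:
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged


def detect_non_speech(samples, frame_len, threshold, min_frames, max_gap):
    """Frame the signal, derive the bias per frame, and detect non-speech runs."""
    values = [spectral_bias(f) for f in make_frames(samples, frame_len)]
    return merge_sandwiched(low_variation_runs(values, threshold, min_frames), max_gap)
```

With this sketch, a run of low-variation frames shorter than `min_frames` is dropped, while a short burst of high variation between two qualifying runs is folded into a single non-speech section. Power or pitch could be substituted for `spectral_bias` as the per-frame value without changing the run logic.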
Publication No. WO2009078093(A1) Publication date 2009.06.25
Application No. WO2007JP74274 Filing date 2007.12.18
Applicant FUJITSU LIMITED;WASHIO, NOBUYUKI;HAYAKAWA, SHOJI Inventor WASHIO, NOBUYUKI;HAYAKAWA, SHOJI
Classification G10L25/78 Main classification G10L25/78
Agency Agent
Main claim
Address