Title of invention: Coherent Pitch and Intensity Modification of Speech Signals
Abstract: A method comprising: receiving an utterance, an original pitch contour of the utterance, and a target pitch contour for the utterance, wherein the utterance comprises a plurality of consecutive frames, and wherein at least one of said frames is a voiced frame; calculating an original intensity contour of said utterance; generating a pitch modified utterance based on the target pitch contour; calculating an intensity modification factor for each of said frames, based on said original pitch contour and on said target pitch contour, to produce a sequence of intensity modification factors corresponding to said plurality of consecutive frames; calculating a final intensity contour for said utterance by applying said intensity modification factors to said original intensity contour; and generating a coherently modified speech signal by time dependent scaling of the intensity of said pitch modified utterance according to said final intensity contour.
Publication number: US2017092285(A1)  Publication date: 2017.03.30
Application number: US201615378100  Filing date: 2016.12.14
Applicant: International Business Machines Corporation  Inventor: Sorin Alexander
Classification: G10L21/013; G10L25/24; G10L13/033; G10L15/08  Main classification: G10L21/013
Agency:  Agent:
Principal claim: 1. A method comprising: operating one or more hardware processors for receiving an utterance embodied as digitized speech signal, an original pitch contour of the utterance, and a target pitch contour for the utterance, wherein the utterance comprises a plurality of consecutive frames, and wherein at least one of said frames is a voiced frame; operating at least one of said one or more hardware processors for calculating an original intensity contour of said utterance; operating at least one of said one or more hardware processors for generating a pitch-modified utterance based on the target pitch contour; operating at least one of said one or more hardware processors for calculating an intensity modification factor for each of said frames, based on said original pitch contour and on said target pitch contour, to produce a sequence of intensity modification factors corresponding to said plurality of consecutive frames, wherein each of the intensity modification factors is ten in the power of the twentieth of the ratio of average empirical decibels per octave multiplied by the extent of pitch modification expressed in octaves; operating at least one of said one or more hardware processors for calculating a final intensity contour for said utterance by applying said intensity modification factors to said original intensity contour; and operating at least one of said one or more hardware processors for generating a coherently-modified speech signal by time-dependent scaling of the intensity of said pitch-modified utterance according to said final intensity contour.
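Illustrative note (not part of the patent text): the intensity modification factor recited in the claim can be read as factor = 10^(A * d / 20), where A is the average empirical decibels-per-octave ratio and d is the pitch modification in octaves, d = log2(target pitch / original pitch). The Python sketch below follows that reading under stated assumptions. The function names, the convention that unvoiced frames carry a pitch of 0 and keep a factor of 1.0, and the example value of 6.0 dB per octave are all hypothetical; the record does not give a numeric ratio or say how unvoiced frames are handled.

```python
import math


def intensity_modification_factors(original_pitch, target_pitch, db_per_octave):
    """Return one factor per frame: 10 ** (db_per_octave * octaves / 20).

    original_pitch, target_pitch: per-frame pitch values in Hz.
    db_per_octave: the average empirical dB-per-octave ratio (caller-supplied).
    """
    if len(original_pitch) != len(target_pitch):
        raise ValueError("pitch contours must have one value per frame")
    factors = []
    for f0_orig, f0_target in zip(original_pitch, target_pitch):
        if f0_orig <= 0 or f0_target <= 0:
            # Assumption: unvoiced frames are marked with pitch 0 and left unscaled.
            factors.append(1.0)
            continue
        # Extent of pitch modification expressed in octaves.
        octaves = math.log2(f0_target / f0_orig)
        factors.append(10.0 ** (db_per_octave * octaves / 20.0))
    return factors


def final_intensity_contour(original_intensity, factors):
    """Apply the per-frame factors to the original intensity contour."""
    if len(original_intensity) != len(factors):
        raise ValueError("intensity contour and factors must have equal length")
    return [value * factor for value, factor in zip(original_intensity, factors)]


if __name__ == "__main__":
    # Hypothetical three-frame utterance: unvoiced, raised one octave, lowered one octave.
    original = [0.0, 100.0, 100.0]
    target = [0.0, 200.0, 50.0]
    gains = intensity_modification_factors(original, target, db_per_octave=6.0)
    print(gains)
    print(final_intensity_contour([0.1, 0.5, 0.5], gains))
```

Under this reading, raising the pitch of a frame by one octave multiplies its intensity by 10^(A/20), and lowering it by one octave divides by the same amount, so the factor sequence follows the pitch change frame by frame.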
Address: Armonk NY US