Title of invention: Two-stage implementation for phonetic recognition using a bi-directional target-filtering model of speech coarticulation and reduction
Abstract: A structured generative model of speech coarticulation and reduction is described, with a novel two-stage implementation. At the first stage, the dynamics of formants or vocal tract resonances (VTRs) are generated using prior information about the resonance targets in the phone sequence. A bi-directional finite impulse response (FIR) temporal filter is applied, taking the segmental target sequence as its input. At the second stage, the dynamics of the speech cepstra are predicted analytically from the FIR-filtered VTR targets. The combined system of these two stages thus generates correlated and causally related VTR and cepstral dynamics, in which phonetic reduction is represented explicitly in the hidden resonance space and implicitly in the observed cepstral space. The combined system also yields the probability of the acoustic observations given a phone sequence. Using this probability, different phone sequences can be compared and ranked by their respective probability values, which permits the use of the model for phonetic recognition.
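The first stage described in the abstract can be illustrated with a minimal sketch: a step-wise sequence of per-phone resonance targets is smoothed by a symmetric, non-causal FIR filter, so that each frame is influenced by both preceding and following phones. The abstract does not specify the filter's shape or parameters; the exponentially decaying impulse response, the names `gamma` and `half_width`, the edge padding, and the target values below are all assumptions chosen for illustration only.

```python
import numpy as np


def bidirectional_fir(targets, gamma=0.6, half_width=7):
    """Smooth a step-wise target track with a symmetric (non-causal) FIR filter.

    gamma and half_width are hypothetical parameters, not taken from the patent.
    """
    k = np.arange(-half_width, half_width + 1)
    h = gamma ** np.abs(k)   # assumed impulse response, decaying in both directions
    h /= h.sum()             # normalize to unit DC gain
    # Repeat the first and last targets so the output has the same length as the input.
    padded = np.pad(np.asarray(targets, dtype=float), half_width, mode="edge")
    return np.convolve(padded, h, mode="valid")


# Hypothetical resonance targets (Hz) and durations (frames) for three phones.
segment_targets = [500.0, 1500.0, 700.0]
durations = [10, 4, 10]
targets = np.repeat(segment_targets, durations)  # segmental (step-wise) target sequence

vtr = bidirectional_fir(targets)
print(vtr.round(1))
```

In this sketch the short middle segment never reaches its 1500 Hz target, because the filter window always overlaps the neighbouring phones. That undershoot is one simple way to picture reduction appearing explicitly in the resonance trajectory.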
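Publication number: US2006200351(A1) Publication date: 2006.09.07
Application number: US20050069474 Filing date: 2005.03.01
Applicant: MICROSOFT CORPORATION Inventors: ACERO ALEJANDRO; YU DONG; DENG LI
Classification: G10L15/04 Main classification: G10L15/04
Agency: Agent:
Principal claim:
Address: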