Title of invention: Segment-based similarity method for low complexity speech recognizer
Abstract: A digital word prototype is constructed using one or more speech utterances for a given spoken word or phrase. First, a phone model is used to derive a phoneme similarity time series for each of a plurality of phonemes; each time series represents the degree of similarity between the speech utterance and a set of standard phonemes contained in the phone model. Next, the phoneme similarity data is normalized in relation to a non-speech part of the input speech signal. The normalized phoneme similarity data is divided into segments such that the sum of all normalized phoneme similarity values in a segment is equal for each segment. Next, a word model is constructed from the phoneme similarity data. To do so, within each segment, a summation value is determined for each phoneme by summing, over the speech frames of that segment, the normalized phoneme similarity values associated with that phoneme. In this way, the word model is represented by a vector of summation values that compactly correlate to the normalized phoneme similarity data. Lastly, the results of the individually processed utterances for a given spoken word (i.e., the individual word models) are combined to produce a digital word prototype that electronically represents the given spoken word.
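The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say how normalization or the final combination is computed, so the sketch assumes (a) normalization subtracts each phoneme's mean similarity over a leading run of non-speech frames and clips at zero, and (b) word models are combined by element-wise averaging. All function names, the `n_silence` parameter, and the toy similarity values are hypothetical. Because segment boundaries fall on whole frames, the per-segment sums are equal only approximately in general.

```python
from typing import List

Frames = List[List[float]]  # frames[t][p]: similarity of frame t to phoneme p


def normalize(frames: Frames, n_silence: int) -> Frames:
    """ASSUMPTION: subtract each phoneme's mean over the first n_silence
    (non-speech) frames and clip negative values to zero."""
    n_ph = len(frames[0])
    base = [sum(f[p] for f in frames[:n_silence]) / n_silence for p in range(n_ph)]
    return [[max(0.0, f[p] - base[p]) for p in range(n_ph)] for f in frames]


def segment_bounds(frames: Frames, n_seg: int) -> List[int]:
    """Frame indices that split the utterance into n_seg segments whose
    total similarity is (approximately) equal."""
    totals = [sum(f) for f in frames]
    grand = sum(totals)
    bounds = [0]
    running = 0.0
    for t, value in enumerate(totals):
        running += value
        # Close a segment each time the cumulative sum reaches the next equal share.
        while len(bounds) < n_seg and running >= grand * len(bounds) / n_seg:
            bounds.append(t + 1)
    while len(bounds) < n_seg:  # guard against floating-point shortfall
        bounds.append(len(frames))
    bounds.append(len(frames))
    return bounds


def word_model(frames: Frames, n_seg: int) -> List[float]:
    """Per segment, sum each phoneme's similarity over the segment's frames;
    the result is a vector of n_seg * n_phonemes summation values."""
    bounds = segment_bounds(frames, n_seg)
    n_ph = len(frames[0])
    model: List[float] = []
    for s in range(n_seg):
        seg = frames[bounds[s]:bounds[s + 1]]
        model.extend(sum(f[p] for f in seg) for p in range(n_ph))
    return model


def prototype(models: List[List[float]]) -> List[float]:
    """ASSUMPTION: combine word models by element-wise averaging."""
    return [sum(vals) / len(models) for vals in zip(*models)]


# Toy utterance: 6 frames, 2 phonemes, first 2 frames treated as non-speech.
raw = [[1, 1], [1, 1], [3, 1], [1, 3], [2, 2], [1, 3]]
model_a = word_model(normalize(raw, n_silence=2), n_seg=2)
proto = prototype([model_a, [4.0, 2.0, 1.0, 1.0]])
```

In this toy case the two segments each carry a total similarity of 4, and the word model has four entries (2 segments × 2 phonemes), which is far smaller than the full frame-by-phoneme time series.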
Publication number: US6230129(B1) Publication date: 2001.05.08
Application number: US19980199721 Filing date: 1998.11.25
Applicant: MATSUSHITA ELECTRIC INDUSTRIAL CO., LTD. Inventors: MORIN PHILIPPE R.;APPLEBAUM TED H.
Classification: G10L15/18;G10L15/02;G10L15/06;G10L15/10;(IPC1-7):G10L15/02;G10L15/20 Main classification: G10L15/18
Agency: Agent:
Main claim:
Address: