Abstract |
<p>A continuous speech analyzer (Fig. 1) is adapted to recognize an utterance (101) as a string of reference words (130) for which acoustic feature signals are stored (105). Responsive to the utterance (103) and the reference word acoustic features (105), at least one reference word series is generated as a candidate for the utterance. Successive word positions for the utterance are identified. In each word position, partial candidate series are generated by a dynamic time warp partitioning circuit (110), which determines a distance signal reflecting a prescribed similarity between utterance segment intervals and reference word templates, building on a partial candidate series of the preceding word position. The candidate utterance segments (130) have beginning points within a predetermined range of the utterance endpoint of the preceding word position candidate series, to account for coarticulation and for differences between the acoustic features of the utterance and those of reference words (105) spoken in isolation. A minimum distance signal (170), selected from a plurality of partial candidates, identifies the candidate string closest to the utterance. </p> |
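<p>The word-position search summarized above can be illustrated with a minimal software sketch. It is not the circuit of Fig. 1: the function names <code>dtw_distance</code> and <code>recognize</code>, the one-dimensional features, the absolute-difference local distance, the segment-length limit, and the rule of keeping one best partial candidate per endpoint are all assumptions made for illustration.</p>

```python
from math import inf


def dtw_distance(segment, template):
    """Classic dynamic-time-warp distance between two 1-D feature sequences.

    Assumption: features are plain numbers and the local distance is |a - b|.
    """
    n, m = len(segment), len(template)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(segment[i - 1] - template[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # stretch the template frame
                cost[i][j - 1],      # stretch the utterance frame
                cost[i - 1][j - 1],  # advance both
            )
    return cost[n][m]


def recognize(utterance, templates, max_words=3, start_range=1):
    """Return (distance, words) for the closest reference word string.

    Hypothetical parameters: `max_words` bounds the number of word positions;
    `start_range` is how far a segment may begin before or after the endpoint
    of the preceding partial candidate (a stand-in for coarticulation).
    """
    n = len(utterance)
    partial = {0: (0.0, [])}  # endpoint -> best (distance, words) so far
    best = (inf, [])
    for _ in range(max_words):  # one pass per word position
        extended = {}
        for prev_end, (prev_dist, words) in partial.items():
            lo = max(0, prev_end - start_range)
            hi = min(n - 1, prev_end + start_range)
            for start in range(lo, hi + 1):
                for word, template in templates.items():
                    # Assumed limit: a segment is at most twice the template length.
                    max_end = min(n, start + 2 * len(template))
                    for end in range(start + 1, max_end + 1):
                        dist = prev_dist + dtw_distance(utterance[start:end], template)
                        if end not in extended or dist < extended[end][0]:
                            extended[end] = (dist, words + [word])
        # A partial candidate that reaches the utterance end is a full candidate.
        if n in extended and extended[n][0] < best[0]:
            best = extended[n]
        partial = extended
    return best
```

<p>For example, <code>recognize([1, 2, 3, 7, 5, 6], {"a": [1, 2, 3], "b": [7, 5, 6]})</code> selects the string whose accumulated distance is smallest. Keeping only one candidate per endpoint is a simplification that bounds the search; a fuller implementation might retain several candidates per word position.</p>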