摘要 |
In the speech recognizer disclosed herein, alignment of an unknown speech sediment, represented by a finely gradiated sequence of frames, with a model sediment represented by a sequence of states is performed by first preparing respective coarse sequences representing the unknown and model segments thereby to define a coarse matrix representing possible alignments. The fine sequences correspondingly define a fine matrix. A best alignment of the coarse sequences is determined thereby to define a coarse path through the coarse matrix. The coarse path is overlaid on the fine matrix and a corridor is defined which includes fine matrix locations which lie within a preselected metric of the coarse path. Only transitions within the corridor are calculated in determining the fine alignment of the unknown speech segment with the model segment, thereby significantly reducing the number of computations required.
|