Title of invention: Text recognition using two-dimensional stochastic models.
Abstract: Pseudo two-dimensional hidden Markov models (HMMs) are used to represent text elements, such as characters or words. Observation vectors for each text element are based on pixel maps obtained by optical scanning. A character is represented by a pseudo two-dimensional HMM having a number of superstates (210-214), with each superstate having at least one state (220-222). Text elements are compared with such models by using the Viterbi algorithm, first in connection with the states in each superstate, then with the superstates themselves, to calculate the probability that a particular model represents the text element. Parameters for the models are generated by training routines. Probabilities can be adjusted to compensate for changes in scale, translation, slant, and rotation. An embodiment is also disclosed for identifying keywords in a body of text. A first pseudo two-dimensional HMM is created for the keyword, and a second for the extraneous words that may appear in the text. Each word in the text is compared with both models, again using the Viterbi algorithm, to calculate the probability that each model represents the subject word. If the probability for the keyword model is greater than that for the extraneous-word model, the subject word is identified as the keyword. Preprocessing steps that reduce the number of words to be compared can be added.
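The two-level Viterbi scoring and the keyword decision described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: it assumes binary pixels with one Bernoulli emission probability per state, strictly left-to-right chains at both levels, and treats the pixel map as a sequence of lines (whether these are rows or columns is left open here). All names (`make_superstate`, `model_score`, `is_keyword`) and all parameter values are hypothetical.

```python
import math

NEG_INF = float("-inf")

def viterbi_left_to_right(emit, stay):
    """Best log score of a left-to-right chain that starts in state 0 and
    ends in the last state. emit[t][s] is the log-likelihood of observation
    t in state s; stay[s] is the self-loop probability of state s."""
    n = len(emit[0])
    score = [NEG_INF] * n
    score[0] = emit[0][0]
    for t in range(1, len(emit)):
        new = [NEG_INF] * n
        for s in range(n):
            best = score[s] + math.log(stay[s])            # remain in s
            if s > 0:                                       # advance from s-1
                best = max(best, score[s - 1] + math.log(1.0 - stay[s - 1]))
            new[s] = best + emit[t][s]
        score = new
    return score[-1]

def make_superstate(p_black, stay=0.5):
    """Hypothetical superstate: per-state probability of a black pixel."""
    return {"p_black": p_black, "stay": [stay] * len(p_black)}

def line_score(line, superstate):
    """Inner Viterbi: align the pixels of one line with the states."""
    emit = [[math.log(p if pixel else 1.0 - p) for p in superstate["p_black"]]
            for pixel in line]
    return viterbi_left_to_right(emit, superstate["stay"])

def model_score(image, model):
    """Outer Viterbi: align the lines of the image with the superstates,
    using the inner line scores as emission log-likelihoods."""
    emit = [[line_score(line, ss) for ss in model["superstates"]]
            for line in image]
    return viterbi_left_to_right(emit, model["stay"])

def is_keyword(image, keyword_model, extraneous_model):
    """Accept the word as the keyword when the keyword model scores higher."""
    return model_score(image, keyword_model) > model_score(image, extraneous_model)

# Toy data (assumed values, chosen only for illustration).
keyword_model = {
    "superstates": [make_superstate([0.1, 0.9]), make_superstate([0.9, 0.1])],
    "stay": [0.5, 0.5],
}
extraneous_model = {
    "superstates": [make_superstate([0.5, 0.5]), make_superstate([0.5, 0.5])],
    "stay": [0.5, 0.5],
}
word = [[0, 0, 1, 1], [0, 0, 1, 1], [1, 1, 0, 0], [1, 1, 0, 0]]
other = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
```

Calling `is_keyword(word, keyword_model, extraneous_model)` compares the two Viterbi scores for a single segmented word. Working in log probabilities avoids numerical underflow when many pixel likelihoods are multiplied. Trained parameters and the compensation for scale, translation, slant, and rotation mentioned in the abstract are outside the scope of this sketch.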
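Publication number: EP0605099(A3)  Publication date: 1995.01.18
Application number: EP19930309222  Filing date: 1993.11.18
Applicant: AT & T CORP  Inventors: AGAZZI OSCAR ERNESTO; KUO SHYH-SHIAW
Classification: G06K9/70; G06K9/62  Main classification: G06K9/70
Agency:  Agent:
Main claim:
Address: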