Title: Text recognition by predictive composed shapes
Abstract: A top-down technique for recognizing character text in an image analyzes each image line from left to right. A current image portion is selected, and possible text prefixes are selected from a dictionary. The upper and lower text contours of each text prefix are compared with a bitmap of the current image portion, and a distance value is generated indicating the quality of the comparison, that is, how closely the upper and lower shapes of the possible prefix match the bitmap. The prefixes are then added to an agenda of prefixes. Based on the distance values, a list of the text prefixes with the best distance values is selected from the agenda. From the selected list, a new list of extended text prefixes is obtained from the dictionary and added to the agenda. The process is repeated until the current image portion ends. At this point, the possible text prefix having the best total distance value is selected as the list of text characters corresponding to the image portion. The total distance value is the sum of the distance values of the text characters forming the text prefix. Possible text words are selected from the agenda using beam search techniques, either by comparing against a threshold or by limiting the selection to a predetermined number of the currently most probable text prefixes.
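The agenda loop described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. Several simplifying assumptions are made: the image portion is given as one hypothetical `(upper, lower)` contour height pair per character position rather than as a bitmap, the `contour` table is a toy stand-in for real shape comparison, the agenda is rebuilt at each step instead of persisting, and the names `contour`, `char_distance`, `recognize`, `beam_width`, and `max_gap` are invented for illustration.

```python
from typing import Iterable, List, Optional, Tuple

def contour(ch: str) -> Tuple[int, int]:
    """Toy (upper, lower) contour heights for a lowercase letter (assumed model)."""
    upper = 2 if ch in "bdfhklt" else 1   # ascender vs. x-height
    lower = -1 if ch in "gjpqy" else 0    # descender vs. baseline
    return (upper, lower)

def char_distance(ch: str, observed: Tuple[int, int]) -> int:
    """Distance between a character's contour and the observed contour (lower is better)."""
    up, lo = contour(ch)
    return abs(up - observed[0]) + abs(lo - observed[1])

def recognize(observed: List[Tuple[int, int]],
              words: Iterable[str],
              beam_width: int = 3,
              max_gap: Optional[int] = None) -> Optional[Tuple[int, str]]:
    """Return (total_distance, word) for the best dictionary word, or None."""
    dictionary = set(words)
    agenda = [(0, "")]  # entries are (total distance so far, text prefix)
    for obs in observed:
        if not agenda:
            return None
        # Beam selection: keep a fixed number of the best prefixes ...
        beam = sorted(agenda)[:beam_width]
        # ... and optionally drop those too far behind the best (threshold).
        if max_gap is not None:
            best = beam[0][0]
            beam = [e for e in beam if e[0] - best <= max_gap]
        agenda = []
        for total, prefix in beam:
            # Only extensions licensed by the dictionary are considered.
            next_chars = {w[len(prefix)] for w in dictionary
                          if w.startswith(prefix) and len(w) > len(prefix)}
            for ch in sorted(next_chars):
                # Total distance is the sum of per-character distances.
                agenda.append((total + char_distance(ch, obs), prefix + ch))
    complete = [(d, p) for d, p in agenda if p in dictionary]
    return min(complete) if complete else None

# Observed contours shaped like "dog": ascender, x-height, descender.
observed = [(2, 0), (1, 0), (1, -1)]
print(recognize(observed, ["dog", "cat", "pig", "bat"]))  # (0, 'dog')
```

With `beam_width=1` the same input yields `(2, 'bat')`, because "b" and "d" tie on the first character and the narrower beam discards the prefix that would have led to "dog". This shows the trade-off between search cost and accuracy that the beam limit introduces.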
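Publication number: US5524066(A) Publication date: 1996.06.04
Application number: US19940220861 Filing date: 1994.03.31
Applicant: XEROX CORPORATION Inventors: KAPLAN, RONALD M.; BOBROW, DANIEL G.
Classification: G06K9/34; G06K9/00; G06K9/46; G06K9/62; G06K9/72; (IPC1-7): G06K9/72 Main classification: G06K9/34
Agency: Agent:
Principal claim:
Address: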