Title of invention METHODS AND SYSTEMS FOR EFFICIENT AUTOMATED SYMBOL RECOGNITION
Abstract The current document is directed to methods and systems for identifying symbols corresponding to symbol images in a scanned-document image or other text-containing image, with the symbols corresponding to Chinese or Japanese characters, to Korean morpho-syllabic blocks, or to symbols of other languages that use a large number of symbols for writing and printing. In one implementation, the methods and systems to which the current document is directed carry out an initial processing step on one or more scanned-document images to identify a subset of the total number of symbols frequently used in the scanned-document image or images. One or more lists of graphemes for the language of the text are then ordered in most-likely-occurring to least-likely-occurring order to facilitate a second optical-character-recognition step in which symbol images extracted from the one or more scanned-document images are associated with one or more graphemes most likely to correspond to the scanned symbol image.
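The ordering step described in the abstract can be illustrated with a minimal Python sketch. It assumes that the initial processing step yields a string of provisionally recognized symbols (here the hypothetical argument `sample_text`) and that likelihood of occurrence is approximated by a plain occurrence count; neither assumption comes from the record, and the function name `order_graphemes_by_frequency` is invented for illustration.

```python
from collections import Counter
from typing import List


def order_graphemes_by_frequency(graphemes: List[str], sample_text: str) -> List[str]:
    """Return graphemes ordered from most to least frequently observed.

    `sample_text` stands in for the output of a hypothetical initial
    recognition pass over the scanned images (an assumption of this sketch).
    """
    known = set(graphemes)
    # Count only symbols that belong to the grapheme list being ordered.
    counts = Counter(ch for ch in sample_text if ch in known)
    # sorted() is stable, so graphemes with equal counts keep their
    # original relative order; unseen graphemes count as zero.
    return sorted(graphemes, key=lambda g: -counts[g])


ordered = order_graphemes_by_frequency(["一", "十", "人", "大"], "人大人十人大")
print(ordered)
```

A second recognition pass could then try candidate graphemes in this order, so that the symbols most common in the document are compared first.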
Publication No. US2015213330(A1) Publication date 2015.07.30
Application No. US201414508492 Filing date 2014.10.07
Applicant ABBYY Development LLC Inventor Chulinin Yuri
Classification G06K9/62;G06K9/82;G06F17/28;G06K9/78 Main classification G06K9/62
Agency Agent
Main claim 1. An optical-symbol-recognition system comprising: one or more processors; one or more data-storage devices; and computer instructions, stored in one or more of the one or more data-storage devices that, when executed by one or more of the one or more processors, control the optical-symbol-recognition system to process a text-containing scanned image by: identifying symbol images from the text-containing scanned image; for each identified symbol image, preprocessing the symbol image to identify graphemes, associated with symbol patterns, that have a computed level of similarity to the symbol image above a threshold level of similarity, and sorting the identified graphemes by computed level of similarity; and using the sorted, identified graphemes to generate symbol encodings for the symbol images that are stored in one or more of the one or more data-storage devices.
Address Moscow RU
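The processing steps recited in the main claim (score each symbol image against symbol patterns, keep graphemes above a similarity threshold, sort them, and emit an encoding) can be sketched as follows. This is a toy illustration under stated assumptions, not the claimed implementation: symbol images are small binary bitmaps, each grapheme has exactly one pattern, similarity is the fraction of matching pixels, and the encoding is the Unicode code point of the best candidate. All function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Bitmap = List[List[int]]  # rows of 0/1 pixels; all bitmaps share one size


def similarity(a: Bitmap, b: Bitmap) -> float:
    """Fraction of pixels that agree (a stand-in metric, assumed here)."""
    total = sum(len(row) for row in a)
    matches = sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return matches / total


def candidate_graphemes(
    image: Bitmap, patterns: Dict[str, Bitmap], threshold: float
) -> List[Tuple[str, float]]:
    """Graphemes whose pattern similarity exceeds the threshold, best first."""
    scored = [(g, similarity(image, p)) for g, p in patterns.items()]
    kept = [(g, s) for g, s in scored if s > threshold]
    return sorted(kept, key=lambda gs: gs[1], reverse=True)


def encode(
    images: List[Bitmap], patterns: Dict[str, Bitmap], threshold: float
) -> List[Optional[int]]:
    """Code point of the top candidate per image, or None if none qualifies."""
    encodings: List[Optional[int]] = []
    for image in images:
        candidates = candidate_graphemes(image, patterns, threshold)
        encodings.append(ord(candidates[0][0]) if candidates else None)
    return encodings


# Toy 3x3 patterns (illustrative only).
PATTERNS = {
    "一": [[0, 0, 0], [1, 1, 1], [0, 0, 0]],
    "十": [[0, 1, 0], [1, 1, 1], [0, 1, 0]],
    "丨": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
}
# A cross with one stray pixel in the lower-right corner.
noisy_cross = [[0, 1, 0], [1, 1, 1], [0, 1, 1]]

print(candidate_graphemes(noisy_cross, PATTERNS, threshold=0.7))
print(encode([noisy_cross], PATTERNS, threshold=0.7))
```

Filtering by threshold before sorting keeps the sort small when the grapheme set is large, which is the situation the record describes for Chinese, Japanese, and Korean text.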