Title of invention DOCUMENT IMAGE COMPRESSION METHOD AND ITS APPLICATION IN DOCUMENT AUTHENTICATION
Abstract A method for compressing a bi-level document image containing text is disclosed. The document image is segmented into symbol images, each representing a letter, numeral, etc. in the document. The symbol images are classified into a plurality of classes, each class being associated with a template image and a class index. Classification is done by comparing each symbol to be classified with the templates of existing classes, using a number of image features including zoning profiles, side profiles, topology statistics, and low-order image moments. These image features are compared using a tolerance-based method to determine whether the symbol matches the template. After classification, certain classes that have few symbols classified into them may be merged with other classes. In addition, the template images of the classes are down-sampled, where the final sizes of the template images are dependent on the likelihood of confusion of the template with other templates.
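The tolerance-based comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the 2x2 zoning grid, the single left-side profile, and the tolerance of 0.3 are all assumptions chosen for illustration, and topology statistics and image moments are left out. Images are assumed to be lists of rows of 0/1 pixels that have already been normalised to the same size.

```python
def zoning_profile(img, rows=2, cols=2):
    """Fraction of black (1) pixels in each cell of a rows x cols grid."""
    h, w = len(img), len(img[0])
    profile = []
    for r in range(rows):
        for c in range(cols):
            y0, y1 = r * h // rows, (r + 1) * h // rows
            x0, x1 = c * w // cols, (c + 1) * w // cols
            cell = [img[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            profile.append(sum(cell) / len(cell) if cell else 0.0)
    return profile


def left_profile(img):
    """Per row: distance from the left edge to the first black pixel,
    normalised by the image width (1.0 if the row is empty)."""
    w = len(img[0])
    return [(row.index(1) if 1 in row else w) / w for row in img]


def within_tolerance(a, b, tol):
    """True if both feature vectors have equal length and agree element-wise within tol."""
    return len(a) == len(b) and all(abs(x - y) <= tol for x, y in zip(a, b))


def symbol_matches(symbol, template, tol=0.3):
    """A symbol matches a template only if every feature is within tolerance."""
    return (within_tolerance(zoning_profile(symbol), zoning_profile(template), tol)
            and within_tolerance(left_profile(symbol), left_profile(template), tol))


# Hypothetical 4x4 test symbols: a vertical bar, the same bar with one
# noise pixel, and a horizontal bar.
bar = [[0, 1, 1, 0] for _ in range(4)]
noisy_bar = [[0, 1, 1, 1]] + [[0, 1, 1, 0] for _ in range(3)]
h_bar = [[0, 0, 0, 0], [1, 1, 1, 1], [1, 1, 1, 1], [0, 0, 0, 0]]
```

In this toy data the vertical and horizontal bars have identical 2x2 zoning profiles and are told apart only by the side profile, which suggests why the method combines several feature types rather than relying on one.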
Publication number US2014185933(A1) Publication date 2014.07.03
Application number US201213730757 Filing date 2012.12.28
Applicant Tian Yibin; Ming Wei Inventor Tian Yibin; Ming Wei
Classification G06K9/62 Main classification G06K9/62
Agency Agent
Principal claim 1. A method for compressing a binary image representing a document containing text regions, the method comprising: (a) segmenting the text regions into a plurality of symbol images, each symbol image representing a symbol of text, each symbol image being bound by a bounding box having a location and a size; (b) classifying each symbol image obtained in step (a) into one of a plurality of classes, each class being represented by a template image and a class index, including, for each symbol image being classified: (b1) comparing the symbol image with each template image to determine whether they match each other, including comparing a plurality of features of the symbol image with the corresponding plurality of features of the template image, the plurality of features including density statistics features, side profile features, topology statistics features and shape features; (b2) if a match is found in step (b1), recording the class index corresponding to the matched template in association with the symbol image being classified; and (b3) if no match is found in step (b1), adding a new class to the plurality of classes, by using the image of the symbol image being classified as the template image of the new class and assigning a class index to the new class, and recording the class index in association with the symbol image being classified; (c) resizing the template image of each class to a final size; and (d) storing, as compressed image data, the resized template image for each of the plurality of classes along with its class index, the bounding box location and size for each symbol image obtained in step (a), and the class index for each symbol image obtained in step (b2) or (b3).
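Steps (b1) to (b3) and the stored data of step (d) can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the claimed implementation: `pixel_match` is a hypothetical stand-in for the feature-based comparison of step (b1), the dictionary layout of the compressed data is invented for the example, segmentation (step a) is assumed to have been done already, and the template resizing of step (c) is omitted.

```python
def pixel_match(symbol, template, max_diff=1):
    """Hypothetical stand-in for step (b1): images match if they have the
    same size and differ in at most max_diff pixels."""
    if len(symbol) != len(template) or len(symbol[0]) != len(template[0]):
        return False
    diff = sum(s != t
               for s_row, t_row in zip(symbol, template)
               for s, t in zip(s_row, t_row))
    return diff <= max_diff


def compress(symbols, match=pixel_match):
    """symbols: list of (x, y, image) tuples, one per segmented symbol,
    where image is a list of rows of 0/1 pixels."""
    templates = {}    # class index -> template image
    placements = []   # (x, y, width, height, class index) for each symbol
    for x, y, img in symbols:
        # (b1) compare the symbol with the template of each existing class
        cls = next((i for i, t in templates.items() if match(img, t)), None)
        if cls is None:
            # (b3) no match: the symbol image becomes the template of a new class
            cls = len(templates)
            templates[cls] = img
        # (b2)/(b3) record the class index together with the bounding box
        placements.append((x, y, len(img[0]), len(img), cls))
    # (d) templates with class indices, plus one record per symbol
    return {"templates": templates, "placements": placements}


# Hypothetical input: the same 3x3 glyph twice and a different glyph once.
glyph_a = [[1, 0, 1], [1, 1, 1], [1, 0, 1]]
glyph_b = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
data = compress([(0, 0, glyph_a), (4, 0, glyph_b), (8, 0, glyph_a)])
```

Here the three symbols are stored as two templates plus three small placement records, which is where the size reduction comes from when a page repeats the same characters many times.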
Address Menlo Park CA US