Title of invention Automated segmentation tuner
Abstract A system, method, and computer program product are provided for automatically segmenting input document images into regions of black text, white space, and image content. A set of scanned training documents representing the range of text and images to be processed is coarsely tagged to classify regions by content type. The training images are divided into bricks, parameters describing individual brick features are evaluated, and the bricks are classified according to the parameter values. A classification map that relates parameter values to classification codes describing content type is constructed by generating linear equations separating a parameter space into parameter regions along classification boundaries. After training, input documents are scanned and divided into bricks, and brick parameters are converted into an index into the classification map, to classify document regions by content.
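The training stage described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the linear equations are generated, so the sketch swaps in a simple stand-in: one perpendicular bisector between each pair of class centroids. The function names (`centroid`, `bisector`, `index_of`, `train`), the two-parameter feature tuples, and the string labels are all hypothetical and not taken from the patent.

```python
from collections import Counter, defaultdict
from itertools import combinations

def centroid(points):
    """Mean of a list of equal-length parameter tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def bisector(c1, c2):
    """Linear equation w.x + b = 0 equidistant from c1 and c2.

    Assumption: a centroid bisector stands in for the patent's
    unspecified way of generating separation equations.
    The positive side contains c2.
    """
    weights = tuple(b - a for a, b in zip(c1, c2))
    mid = tuple((a + b) / 2 for a, b in zip(c1, c2))
    bias = -sum(w * m for w, m in zip(weights, mid))
    return weights, bias

def index_of(params, equations):
    """Concatenate one bit per equation (which side the brick falls on)."""
    index = 0
    for weights, bias in equations:
        side = sum(w * x for w, x in zip(weights, params)) + bias > 0
        index = (index << 1) | int(side)
    return index

def train(tagged):
    """Build (equations, classification map) from (params, label) pairs."""
    by_label = defaultdict(list)
    for params, label in tagged:
        by_label[label].append(params)
    centroids = {label: centroid(pts) for label, pts in sorted(by_label.items())}
    equations = [bisector(centroids[a], centroids[b])
                 for a, b in combinations(centroids, 2)]
    # Each parameter region (index) takes the majority label of the
    # training bricks that land in it.
    votes = defaultdict(Counter)
    for params, label in tagged:
        votes[index_of(params, equations)][label] += 1
    class_map = {idx: counts.most_common(1)[0][0] for idx, counts in votes.items()}
    return equations, class_map

# Hypothetical tagged bricks: (mean intensity, intensity range) -> label.
TAGGED = [
    ((250.0, 5.0), "white"), ((245.0, 10.0), "white"),
    ((130.0, 250.0), "text"), ((120.0, 240.0), "text"),
    ((100.0, 60.0), "image"), ((110.0, 80.0), "image"),
]
equations, class_map = train(TAGGED)
```

With three content types this yields three equations and a 3-bit index. Regions that no training brick reaches have no entry in `class_map`, so a caller would need a fallback such as `class_map.get(index)`.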
Publication number US9070011(B2) Publication date 2015.06.30
Application number US201113163524 Filing date 2011.06.17
Applicant CSR Imaging US, LP Inventors Andree Fred W.;Schuneman Thomas A.
Classification G06K9/34;G06K9/00;H04N1/40 Main classification G06K9/34
Agency Vorys, Sater, Seymour and Pease LLP Agent Vorys, Sater, Seymour and Pease LLP;DeLuca Vincent M
Main claim 1. A method for processing an image region of an image having a plurality of pixels, comprising: dividing the image region into a plurality of bricks of pixels; calculating parameters of each of the plurality of bricks; applying in a computer system the calculated parameters of each brick to a plurality of brick parameter separation equations to obtain a result for each of the plurality of brick parameter separation equations; computing an index for each brick by concatenating the brick parameter separation equation results for that brick; performing a classification map lookup using the index; and classifying the brick as representing white space, black text, or image content according to the result of the classification map lookup; whereby the image region is segmented into sub-regions of white space, black text, and image content according to the classification of said bricks.
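The steps of claim 1 can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the brick parameters (mean intensity and intensity range), the two separation equations, their coefficients, and the contents of the classification map are all hypothetical, as are the function names.

```python
from typing import Dict, Iterator, List, Sequence, Tuple

Region = List[List[int]]  # rows of 8-bit grayscale pixel values

# Classification codes; the numeric values are arbitrary for this sketch.
WHITE_SPACE, BLACK_TEXT, IMAGE = 0, 1, 2

def split_into_bricks(region: Region, size: int) -> Iterator[Tuple[int, int, Region]]:
    """Yield (row, col, brick) for each size x size tile; edge tiles may be smaller."""
    for r in range(0, len(region), size):
        for c in range(0, len(region[0]), size):
            yield r, c, [row[c:c + size] for row in region[r:r + size]]

def brick_parameters(brick: Region) -> Tuple[float, float]:
    """Hypothetical parameters: mean intensity and intensity range."""
    pixels = [p for row in brick for p in row]
    return sum(pixels) / len(pixels), float(max(pixels) - min(pixels))

# Hypothetical separation equations a*mean + b*range + c > 0.
SEPARATION_EQUATIONS = [
    (1.0, 0.0, -200.0),  # bit set when the brick is bright (mean > 200)
    (0.0, 1.0, -100.0),  # bit set when the brick is high contrast (range > 100)
]

def brick_index(params: Sequence[float],
                equations: Sequence[Tuple[float, float, float]]) -> int:
    """Concatenate the one-bit result of each separation equation into an index."""
    index = 0
    for a, b, c in equations:
        bit = 1 if a * params[0] + b * params[1] + c > 0 else 0
        index = (index << 1) | bit
    return index

# Hypothetical classification map: index -> classification code.
CLASSIFICATION_MAP: Dict[int, int] = {
    0b00: IMAGE,        # not bright, low contrast
    0b01: BLACK_TEXT,   # not bright, high contrast
    0b10: WHITE_SPACE,  # bright, low contrast
    0b11: BLACK_TEXT,   # bright, high contrast
}

def classify_region(region: Region, size: int) -> Dict[Tuple[int, int], int]:
    """Classify every brick; keys are the brick's top-left (row, col)."""
    result = {}
    for r, c, brick in split_into_bricks(region, size):
        index = brick_index(brick_parameters(brick), SEPARATION_EQUATIONS)
        result[(r, c)] = CLASSIFICATION_MAP[index]
    return result
```

For example, a 4x8 region whose left half is solid white and whose right half alternates black and white pixels would be split into two 4x4 bricks by `classify_region(region, 4)`. Because the index is just the concatenated equation results, classification at run time reduces to a handful of multiply-adds and one table lookup per brick.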
Address Burlington MA US