Title: Physical page layout analysis via tab-stop detection for optical character recognition
Abstract: Physical page layout analysis for optical character recognition is performed. A physical page layout analysis method finds the constituent parts of an image and gives each an initial data-type label, such as text or non-text. Within the text data, connected components are identified and analyzed. Tab-stops are detected from groups of edge-aligned connected components. The detected tab-stops are used to deduce the column layout of the page by finding column partitions. The column layout is then applied to find the polygonal boundaries and reading order of regions containing flowing text, headings, and pull-outs.
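The tab-stop detection step in the abstract can be illustrated with a minimal sketch. It is not the patented method: it only assumes that each connected component is available as a bounding box, and that a candidate tab-stop is any group of boxes whose left edges agree within a small pixel tolerance. The names `Box`, `find_left_tab_stops`, `tolerance`, and `min_support` are hypothetical and chosen for illustration.

```python
from collections import namedtuple

# Hypothetical bounding box of one connected component, in pixel coordinates.
Box = namedtuple("Box", "left top right bottom")


def _summarize(group):
    """Return (median left edge, members sorted top to bottom) for one group."""
    xs = sorted(b.left for b in group)
    return xs[len(xs) // 2], sorted(group, key=lambda b: b.top)


def find_left_tab_stops(boxes, tolerance=3, min_support=3):
    """Group boxes whose left edges lie within `tolerance` pixels of the
    first (leftmost) box in the group.

    Groups with fewer than `min_support` members are discarded as noise.
    Returns a list of (x_position, members) ordered by x_position.
    Both thresholds are illustrative assumptions, not values from the patent.
    """
    tab_stops = []
    group = []
    for box in sorted(boxes, key=lambda b: b.left):
        # Close the current group when this edge is too far from its first edge.
        if group and box.left - group[0].left > tolerance:
            if len(group) >= min_support:
                tab_stops.append(_summarize(group))
            group = []
        group.append(box)
    if len(group) >= min_support:
        tab_stops.append(_summarize(group))
    return tab_stops


# Two columns of three text lines each, plus one stray component at x=90.
boxes = [
    Box(10, 0, 80, 10), Box(11, 20, 85, 30), Box(12, 40, 70, 50),
    Box(200, 0, 280, 10), Box(201, 20, 270, 30), Box(199, 40, 275, 50),
    Box(90, 60, 95, 70),
]
stops = find_left_tab_stops(boxes)
print([x for x, _ in stops])
```

A fuller treatment would also handle right-aligned edges, require the aligned components to be vertically close to one another, and then use the accepted tab-stops to place column partitions, as the abstract describes.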
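Publication number: US8249356(B1) Publication date: 2012.08.21
Application number: US20090357004 Filing date: 2009.01.21
Applicant: SMITH RAYMOND; GOOGLE INC. Inventor: SMITH RAYMOND
Classification: G06K9/48 Main classification: G06K9/48
Agency: Agent:
Main claim:
Address: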