Title of invention POST-OCR IMAGE SEGMENTATION INTO SPATIALLY SEPARATED TEXT ZONES
Abstract <p>This invention describes a post-recognition procedure for grouping text recognized by an Optical Character Reader (OCR) from a document image into zones. Once the recognized text and the bounding box of each word are received, the procedure dilates (expands) the word bounding boxes by a factor and records which pairs cross. Two word bounding boxes cross upon dilation if the corresponding words are very close to each other in the original document. The text is then grouped into zones using the rule that two words belong to the same zone if their word bounding boxes cross upon dilation. The text zones thus identified are sorted and returned.</p>
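The grouping rule in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation; several details are assumptions made only for illustration: boxes are `(left, top, right, bottom)` tuples, dilation scales each box about its center by `factor`, the "same zone" rule is applied transitively (connected components via union-find), and zones are sorted top-to-bottom, then left-to-right. The names `dilate`, `boxes_cross`, and `group_into_zones` are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumed box layout: (left, top, right, bottom), with y growing downward.
Box = Tuple[float, float, float, float]


def dilate(box: Box, factor: float) -> Box:
    """Scale a box about its center by `factor` (assumed dilation scheme)."""
    left, top, right, bottom = box
    dx = (right - left) * (factor - 1) / 2
    dy = (bottom - top) * (factor - 1) / 2
    return (left - dx, top - dy, right + dx, bottom + dy)


def boxes_cross(a: Box, b: Box) -> bool:
    """Return True if two axis-aligned boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def group_into_zones(words: List[str], boxes: List[Box],
                     factor: float = 1.5) -> List[List[str]]:
    """Group words whose dilated boxes cross, directly or through a chain."""
    n = len(words)
    dilated = [dilate(b, factor) for b in boxes]
    parent = list(range(n))  # union-find forest over word indices

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Record every crossing pair by merging the two words' sets.
    for i in range(n):
        for j in range(i + 1, n):
            if boxes_cross(dilated[i], dilated[j]):
                parent[find(i)] = find(j)

    members: Dict[int, List[int]] = {}
    for i in range(n):
        members.setdefault(find(i), []).append(i)

    def reading_key(i: int) -> Tuple[float, float]:
        # Assumed sort order: by top edge, then by left edge.
        return (boxes[i][1], boxes[i][0])

    zones = [sorted(idx, key=reading_key) for idx in members.values()]
    zones.sort(key=lambda idx: reading_key(idx[0]))
    return [[words[i] for i in idx] for idx in zones]
```

For example, two words on one line separated by a small gap end up in one zone, while a line far below them forms a separate zone. The pairwise loop is quadratic in the number of words; a spatial index would be a natural refinement for large pages, but that is outside what the abstract states.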
Publication number WO2007022460(A2) Publication date 2007.02.22
Application number WO2006US32483 Filing date 2006.08.18
Applicant DIGITAL BUSINESS PROCESSES, INC.;ROMANOFF, HARRIS;SPERO, LESLIE;SINGH, SARABJIT Inventor ROMANOFF, HARRIS;SPERO, LESLIE;SINGH, SARABJIT
Classification G06K9/34 Main classification G06K9/34
Agency Agent
Principal claim
Address