Title: Separation of touching characters in optical character recognition
Abstract: Method and apparatus for separating touching characters within an optical character recognition (OCR) computer (1). An input document (20) is scanned by a scanner (2), forming a set of scan lines (3). A segmentation process (4) is performed on the scan lines (3) to create a set of segmented image boxes (5). Candidate characters within the image boxes (5) are classified by a classification module (6), based upon a library of stored models (7). Candidate characters that are recognized with a high degree of confidence are classified and coded into a binary form (8), such as ASCII. Candidate characters that are not classified are passed to a touching character decision module (9), which determines whether a series of separation modules (10-14) is to be invoked. Executing modules (10-13) and then re-executing modules (4) and (6) may or may not separate all of the touching characters. Any touching characters that remain are subjected to one or more reprocessing cycles. The reprocessing can entail examination (14) of adjacent scan lines (3), shifting of the separation threshold T by the separation threshold determination module (10), or re-execution of the deconvolution step (12) with changed parameters or structure.
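The classify, separate, and reclassify cycle described in the abstract can be illustrated with a minimal sketch. It is not the patented method: the abstract does not say how the separation modules work, so the cut rule used here (splitting at the interior column with the least ink when that count is at most a threshold `t`) is a common projection-profile heuristic swapped in for illustration, and the deconvolution step (12) is not modeled at all. All names (`column_profile`, `split_at_threshold`, `recognize`, `toy_classify`, `min_conf`, `max_t`) are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

Image = List[List[int]]  # rows of pixels, 1 = ink, 0 = background
Classifier = Callable[[Image], Tuple[str, float]]  # returns (label, confidence)


def column_profile(box: Image) -> List[int]:
    """Count the ink pixels in each column of an image box."""
    return [sum(col) for col in zip(*box)]


def split_at_threshold(box: Image, t: int) -> List[Image]:
    """Cut at the interior column with the least ink, if that count is <= t.

    Assumption: the cut column is treated as the bridge between two
    touching characters and is dropped from both halves.
    """
    profile = column_profile(box)
    if len(profile) < 3:
        return [box]  # too narrow to have an interior column
    cut = min(range(1, len(profile) - 1), key=lambda c: profile[c])
    if profile[cut] > t:
        return [box]  # no column is thin enough at this threshold
    return [[row[:cut] for row in box], [row[cut + 1:] for row in box]]


def recognize(box: Image, classify: Classifier, min_conf: float = 0.9,
              max_t: int = 3) -> List[Optional[str]]:
    """Classify a box; on low confidence, retry with a rising threshold."""
    label, conf = classify(box)
    if conf >= min_conf:
        return [label]
    for t in range(max_t + 1):  # the "shift threshold T" reprocessing cycle
        parts = split_at_threshold(box, t)
        if len(parts) > 1:
            # Each part is narrower than the box, so the recursion terminates.
            return [s for p in parts for s in recognize(p, classify, min_conf, max_t)]
    return [None]  # still unclassified after every reprocessing cycle


def toy_classify(box: Image) -> Tuple[str, float]:
    """Toy stand-in for the model library: knows only a 2-pixel-wide bar."""
    full_bar = all(len(row) == 2 and all(row) for row in box)
    return ("I", 1.0) if full_bar else ("?", 0.0)


# Two bars joined by a one-pixel bridge in the middle row.
touching = [
    [1, 1, 0, 1, 1],
    [1, 1, 1, 1, 1],
    [1, 1, 0, 1, 1],
]
print(recognize(touching, toy_classify))  # ['I', 'I']
```

In this toy run the box is rejected at `t = 0`, because the bridge column still holds one ink pixel, and is separated at `t = 1`, which mirrors the idea of shifting the threshold on a later reprocessing cycle.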
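Publication number: EP0780782(A2) Publication date: 1997.06.25
Application number: EP19960308353 Filing date: 1996.11.19
Applicant: CANON KABUSHIKI KAISHA Inventor: JAMALI, HAMADI
Classification: G06K9/34;(IPC1-7):G06K9/34 Main classification: G06K9/34
Agency: Agent:
Principal claim:
Address: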