System for extracting attached text from a table-cell frame, Application No. EP19970304087 - 传众 Patent Search

Title of invention	System for extracting attached text from a table-cell frame
Abstract	A method for identifying and extracting text data from a table-cell frame. The method includes the steps of tracing connected components of a document image, tracing white contours within a connected component, defining a frame outline based on the white contours, identifying unattached character data inside the frame outline, and defining an initial rectangular area inside the frame outline. The method further includes detecting black pixels in a horizontal or vertical direction from the initial rectangular area in order to create an extended character area, locating boundary pixels lying inside the extended character area for each white contour, identifying black pixels positioned between boundary pixels lying inside the extended character area, combining black pixels positioned between boundary pixels lying inside the extended character area so as to form at least one connected component, recognizing the at least one connected component as a text component if it is not recognized as a vertical line, as a horizontal line, as part of a broken line, or as part of the frame, and defining a character node of a hierarchical tree structure corresponding to the extended character area and containing both the at least one connected component and any identified unattached connected components.
Publication number	EP0814422(A2)	Publication date	1997.12.29
Application number	EP19970304087	Filing date	1997.06.11
Applicant	CANON KABUSHIKI KAISHA	Inventor	SHIN-YWAN, WANG
Classification	G06K9/20;G06K9/34;G06T11/60;(IPC1-7):G06K9/20	Main classification	G06K9/20
Agency		Agent
Main claim
Address
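
Two of the steps named in the abstract, growing the initial rectangular area toward nearby black pixels and then deciding whether a connected component looks like text or like a line, can be illustrated with a small sketch. This is not the patented method. It assumes a binary image stored as a list of rows with 1 for black, rectangles given as inclusive (top, left, bottom, right) tuples, and a caller-supplied `limit` rectangle that stands in for the frame interior in place of the white-contour boundary pixels the abstract describes. The names `extend_rect`, `components`, and `looks_like_line`, and the thresholds `min_len` and `max_thick`, are hypothetical. A breadth-first flood fill with 4-connectivity replaces contour tracing.

```python
from collections import deque

BLACK = 1  # assumption: 1 marks a black pixel, 0 a white pixel


def extend_rect(img, rect, limit):
    """Grow rect by one row or column at a time while the adjacent row or
    column (restricted to the current span) holds a black pixel.
    Growth never goes past `limit`. Rectangles are inclusive."""
    top, left, bottom, right = rect
    l_top, l_left, l_bottom, l_right = limit
    changed = True
    while changed:
        changed = False
        if top > l_top and any(img[top - 1][x] for x in range(left, right + 1)):
            top -= 1
            changed = True
        if bottom < l_bottom and any(img[bottom + 1][x] for x in range(left, right + 1)):
            bottom += 1
            changed = True
        if left > l_left and any(img[y][left - 1] for y in range(top, bottom + 1)):
            left -= 1
            changed = True
        if right < l_right and any(img[y][right + 1] for y in range(top, bottom + 1)):
            right += 1
            changed = True
    return top, left, bottom, right


def components(img, rect):
    """Return the 4-connected black components inside rect as pixel lists."""
    top, left, bottom, right = rect
    seen = set()
    found = []
    for y in range(top, bottom + 1):
        for x in range(left, right + 1):
            if img[y][x] != BLACK or (y, x) in seen:
                continue
            queue = deque([(y, x)])
            seen.add((y, x))
            pixels = []
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    inside = top <= ny <= bottom and left <= nx <= right
                    if inside and img[ny][nx] == BLACK and (ny, nx) not in seen:
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            found.append(pixels)
    return found


def looks_like_line(pixels, min_len=5, max_thick=1):
    """Crude stand-in for the line tests: long and thin bounding box."""
    ys = [p[0] for p in pixels]
    xs = [p[1] for p in pixels]
    height = max(ys) - min(ys) + 1
    width = max(xs) - min(xs) + 1
    return max(height, width) >= min_len and min(height, width) <= max_thick


# A 7x9 cell: a one-pixel frame, plus a stroke that rises from an
# unattached mark at row 3 and touches the top frame line.
IMG = [
    [1, 1, 1, 1, 1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1, 0, 0, 0, 1],
    [1, 0, 0, 1, 1, 0, 0, 0, 1],
    [1, 0, 0, 0, 0, 0, 0, 0, 1],
    [1, 0, 0, 0, 0, 0, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1, 1, 1],
]
initial = (3, 3, 3, 4)       # box around the unattached mark
interior = (1, 1, 5, 7)      # frame interior, excluding the frame lines
extended = extend_rect(IMG, initial, interior)
text_parts = [c for c in components(IMG, extended) if not looks_like_line(c)]
```

In this toy cell the extended area reaches up to the row just below the top frame line, so the stroke that touches the frame is grouped with the unattached mark. A real implementation would also need the broken-line and frame-membership tests listed in the abstract, which this sketch omits.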