Title of invention METHOD OF INTERPRETING DOCUMENT IMAGE AREA
Abstract PURPOSE: A method of interpreting a document image area is provided that extracts connected components, groups them into tree structures according to their spatial relations, and readjusts the components in a text area through separation and combination procedures, so that the document structure is interpreted efficiently. CONSTITUTION: Connected components are analyzed on a reduced document image (61, 62). A tree is generated from the result of that analysis, and the connected components are classified (63, 64). Text elements are grouped according to the spatial relations among the classified connected components. Text blocks are then readjusted through separation and combination of the connected components. The step of generating the tree and classifying the connected components comprises the following sub-steps: the tree is constructed from the types of the connected components; connected components including tables, frames, and pictures are grouped as independent nodes with text; connected components within a text block surrounded by margins are grouped; and nodes that remain ungrouped are classified by the areas of their connected components.
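The first and last steps of the abstract (connected-component analysis, then classification by area) can be illustrated with a minimal sketch. It is not the patented procedure: the 8-connectivity, the bounding-box area measure, the `text_max_area` threshold, the function names, and the sample image are all assumptions chosen for illustration, and the tree construction and the separation/combination steps are omitted.

```python
from collections import deque


def connected_components(image):
    """Return bounding boxes (top, left, bottom, right) of the
    8-connected foreground regions of a binary image (1 = ink)."""
    rows = len(image)
    cols = len(image[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited ink pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def classify_by_area(box, text_max_area=6):
    """Label a component by bounding-box area.
    text_max_area is a hypothetical threshold, not a value from the patent."""
    top, left, bottom, right = box
    area = (bottom - top + 1) * (right - left + 1)
    return "text" if area <= text_max_area else "non-text"


# Hypothetical reduced document image: a small mark and a larger solid block.
image = [
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
]
boxes = connected_components(image)
labels = [classify_by_area(b) for b in boxes]
```

In this sample the small mark falls under the assumed threshold and the solid block exceeds it. A real system would derive the threshold from the statistics of the reduced image rather than hard-coding it.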
Publication number KR20020055454(A) Publication date 2002.07.09
Application number KR20000083420 Filing date 2000.12.28
Applicant ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE Inventors CHO, SU HYEON; HWANG, YEONG SEOP; JANG, DAE GEUN; JI, SU YEONG; JUNG, YEON GU; MUN, GYEONG AE
Classification G06T7/40; G06K9/20; (IPC1-7): G06T7/40 Main classification G06T7/40
Agency Agent
Main claim
Address