Title of invention |
Method for identifying and using table structures |
Abstract |
A method for recognizing a table structure from a delineated table region in an electronic document using hierarchical clustering of data strings. The cluster groupings are separated using distances derived from positional vectors associated with words and groups of words, rather than a minimum number of blank spaces between words. Once a data tree of the hierarchical clusterings is constructed, the tree is scanned downward from the root to find appropriate column boundaries using a columnization algorithm. Successive heuristic algorithms then determine the column and row headers and the row boundaries.
|
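The clustering and top-down scan described in the abstract can be illustrated with a minimal sketch. It is not the patented method: the abstract does not specify the distance measure, the linkage rule, or the columnization criterion, so the choices here are assumptions made for illustration. Each word is given a hypothetical positional vector `(start column, end column)`, clusters are merged by Euclidean distance between centroids, and the tree is split from the root wherever the two children were merged at a distance above `split_above`. The names `Node`, `build_tree`, `find_columns`, and `split_above` are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple

Vec = Tuple[float, float]  # assumed positional vector: (start column, end column)


@dataclass
class Node:
    words: List[str]
    vecs: List[Vec]
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    dist: float = 0.0  # distance between the two children at merge time

    def centroid(self) -> Vec:
        n = len(self.vecs)
        return (sum(v[0] for v in self.vecs) / n, sum(v[1] for v in self.vecs) / n)


def build_tree(words: List[Tuple[str, Vec]]) -> Node:
    """Agglomerative clustering over a non-empty word list:
    repeatedly merge the two clusters whose centroids are closest."""
    nodes = [Node([text], [vec]) for text, vec in words]
    while len(nodes) > 1:
        d, i, j = min(
            (math.dist(a.centroid(), b.centroid()), i, j)
            for i, a in enumerate(nodes)
            for j, b in enumerate(nodes)
            if i < j
        )
        a, b = nodes[i], nodes[j]
        merged = Node(a.words + b.words, a.vecs + b.vecs, a, b, d)
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)] + [merged]
    return nodes[0]


def find_columns(node: Node, split_above: float) -> List[Node]:
    """Scan down from the root, splitting a cluster only while its two
    children were merged at a distance greater than split_above."""
    if node.left is None or node.right is None or node.dist <= split_above:
        return [node]
    return find_columns(node.left, split_above) + find_columns(node.right, split_above)


# Three rows of a small table; positions are made-up character columns.
words = [
    ("Name", (0, 3)), ("Qty", (10, 12)), ("Price", (18, 22)),
    ("Apple", (0, 4)), ("3", (12, 12)), ("1.20", (19, 22)),
    ("Kiwi", (0, 3)), ("12", (11, 12)), ("0.50", (19, 22)),
]
tree = build_tree(words)
cols = sorted(find_columns(tree, split_above=5.0), key=lambda n: n.centroid()[0])
for col in cols:
    print(sorted(col.words))
```

In this sketch the fixed `split_above` threshold stands in for the columnization algorithm, which the abstract names but does not describe. The header and row-boundary heuristics are not modeled at all.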
Application publication number |
US2003097384(A1) |
Application publication date |
2003.05.22 |
Application number |
US20000734057 |
Filing date |
2000.12.11 |
Applicant |
HU JIANYING;KASHI RAMANUJAN S.;LOPRESTI DANIEL P.;WILFONG GORDON T. |
Inventor |
HU JIANYING;KASHI RAMANUJAN S.;LOPRESTI DANIEL P.;WILFONG GORDON T. |
Classification number |
G06F17/24;G06F17/27;(IPC1-7):G06F17/21 |
Main classification number |
G06F17/24 |
Agency |
|
Agent |
|
Principal claim |
|
Address |
|