Title of invention |
METHOD AND DEVICE FOR ACQUIRING STRUCTURED INFORMATION IN LAYOUT FILE |
Abstract |
<p>The present application discloses a method and an apparatus for obtaining structured information from a fixed layout document, with the aim of speeding up structuring for information management of such documents. The method may comprise: determining initial page number information corresponding to a current directory entry of the document; segmenting first article content of the page corresponding to the initial page number information into at least one structured-characters-block; searching the structured-characters-blocks for a first structured-characters-block that matches the name strings of the current directory entry, and obtaining first position information indicating where the first structured-characters-block is located in the first article content; and obtaining initial position information of the current directory entry and end position information of the previous directory entry based on the first position information.</p> |
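The steps named in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation. It assumes that page content is already available as plain text, that a "structured-characters-block" can be approximated by one non-empty line, and that a match means equality after whitespace and case normalization. The names `Block`, `segment_page`, `locate_entry`, and `entry_boundaries` are hypothetical and do not come from the application.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class Block:
    text: str    # block text with surrounding whitespace removed
    offset: int  # character offset of the block within the page text


def segment_page(page_text: str) -> list[Block]:
    """Split page text into blocks (assumption: one non-empty line per block)."""
    blocks = []
    offset = 0
    for line in page_text.splitlines(keepends=True):
        stripped = line.strip()
        if stripped:
            blocks.append(Block(stripped, offset + line.index(stripped)))
        offset += len(line)
    return blocks


def normalize(s: str) -> str:
    """Collapse whitespace and ignore case (assumed matching rule)."""
    return " ".join(s.split()).casefold()


def locate_entry(pages: dict[int, str], name: str, page_no: int) -> Optional[tuple[int, int]]:
    """Return (page number, offset) of the first block matching the entry name."""
    target = normalize(name)
    for block in segment_page(pages.get(page_no, "")):
        if normalize(block.text) == target:
            return (page_no, block.offset)
    return None


def entry_boundaries(pages: dict[int, str], toc: list[tuple[str, int]]):
    """Derive each entry's start; the next entry's start serves as its end."""
    starts = [locate_entry(pages, name, page_no) for name, page_no in toc]
    result = []
    for i, (name, _) in enumerate(toc):
        end = starts[i + 1] if i + 1 < len(starts) else None
        result.append((name, starts[i], end))
    return result
```

In this sketch, the position found for the current directory entry is used both as that entry's start and as the end of the previous entry, which mirrors the last step of the abstract. The final entry has no successor, so its end is left as `None`.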
Publication number |
EP2790111(A1) |
Publication date |
2014.10.15 |
Application number |
EP20120855138 |
Filing date |
2012.12.07 |
Applicant |
PEKING UNIVERSITY FOUNDER GROUP CO., LTD;BEIJING FOUNDER APABI TECHNOLOGY LIMITED |
Inventor |
DONG, NING;HUANG, WENJUAN;ZHANG, BAOLIANG |
Classification |
G06F17/21;G06F17/22;G06F17/27;G06K9/00 |
Main classification |
G06F17/21 |
Agency |
|
Agent |
|
Principal claim |
|
Address |
|