Abstract |
<p>Detection of table of contents entries in a fixed format document, for reconstruction of those entries in a flow format document, is provided. One or more table of contents entries are detected in a fixed format document, and table of contents entry candidates are generated by grouping one or more lines containing suspected table of contents entries. Each grouping is compared to text contained in the fixed format document to locate matching headings, subheadings, and associated text. After non-matching candidates and false positive matches are discarded, the headings in the fixed format document that match headings in the remaining table of contents entry candidates are used to reconstruct table of contents entries in a table of contents page, area, or section of a reconstructed flow format document.</p> |
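
The steps described above (grouping lines into entry candidates, comparing them against headings in the document, and discarding non-matches) can be illustrated with a minimal sketch. It is not the claimed method: the `TOC_LINE` pattern, the `TocEntry` type, the function names, and the rule that a line without a trailing page number begins a wrapped entry are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass

# Assumed line shape: a title, then dot leaders or wide spacing, then a page number.
TOC_LINE = re.compile(r"^(?P<title>.+?)\s*(?:\.{2,}|\s{2,})\s*(?P<page>\d+)$")


@dataclass
class TocEntry:
    title: str
    page: int


def normalize(text):
    """Lowercase and collapse whitespace so comparisons ignore layout differences."""
    return " ".join(text.lower().split())


def detect_candidates(toc_lines):
    """Group suspected table of contents lines into entry candidates.

    Assumption: a line with no trailing page number is the start of an
    entry that wraps onto the following line(s).
    """
    candidates = []
    pending = []  # title fragments of an entry that has not ended yet
    for raw in toc_lines:
        line = raw.strip()
        if not line:
            continue
        match = TOC_LINE.match(line)
        if match:
            pending.append(match.group("title"))
            candidates.append(TocEntry(" ".join(pending), int(match.group("page"))))
            pending = []
        else:
            pending.append(line)
    return candidates


def match_headings(candidates, body_headings):
    """Keep candidates whose title matches a heading found in the document body.

    Candidates with no matching heading are treated as false positives
    and discarded.
    """
    known = {normalize(heading) for heading in body_headings}
    return [c for c in candidates if normalize(c.title) in known]


toc_lines = [
    "Introduction ..... 1",
    "Detection of Table of Contents",  # wrapped entry, continues on next line
    "Entries ..... 12",
    "Total items ..... 40",            # looks like an entry, but no such heading
]
body_headings = [
    "Introduction",
    "Detection of Table of  Contents Entries",
    "Conclusion",
]
entries = match_headings(detect_candidates(toc_lines), body_headings)
```

Here `entries` keeps the "Introduction" entry and the two-line wrapped entry, and drops "Total items" because no heading in the body matches it. Exact matching after normalization is the simplest possible comparison; a real fixed format document would likely need a more tolerant comparison and positional cues such as the page number.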