Abstract |
Methods, systems, and machine-readable instructions for processing an electronic document are described. In one aspect, logical blocks that were extracted from the electronic document are received, including a text block comprising text lines, each encompassed by a respective bounding rectangle. Edges of some of the bounding rectangles are extended to at least one boundary without changing the layout relationships among the logical blocks in the electronic document. A text layout boundary is generated from the extended and unextended edges of the bounding rectangles. A description of the text layout boundary is stored in a machine-readable medium.
|
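The steps summarized in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation. The names `Rect`, `overlaps`, `extend_edges`, `layout_boundary`, and `describe_boundary` are hypothetical, as is the JSON output format. The sketch assumes image coordinates with y growing downward, text lines that abut vertically, a single left and right boundary, and that "not changing layout relationships" can be approximated by refusing any extension that would make a text line overlap another logical block.

```python
import json
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class Rect:
    """Axis-aligned bounding rectangle (hypothetical type; y grows downward)."""
    left: float
    top: float
    right: float
    bottom: float


def overlaps(a: Rect, b: Rect) -> bool:
    """True if the interiors of a and b intersect (touching edges do not count)."""
    return (a.left < b.right and b.left < a.right
            and a.top < b.bottom and b.top < a.bottom)


def extend_edges(lines: List[Rect], left_limit: float, right_limit: float,
                 other_blocks: List[Rect]) -> List[Rect]:
    """Extend each line's left/right edge to the boundary when that is safe.

    Assumption: an extension is "safe" if the widened rectangle does not
    overlap any other logical block. Unsafe edges are left unextended.
    """
    result = []
    for line in lines:
        rect = line
        widened = replace(rect, left=min(rect.left, left_limit))
        if not any(overlaps(widened, b) for b in other_blocks):
            rect = widened
        widened = replace(rect, right=max(rect.right, right_limit))
        if not any(overlaps(widened, b) for b in other_blocks):
            rect = widened
        result.append(rect)
    return result


def layout_boundary(lines: List[Rect]) -> List[Tuple[float, float]]:
    """Trace a polygon down the right edges and back up the left edges."""
    ordered = sorted(lines, key=lambda r: r.top)
    points = []
    for r in ordered:
        points += [(r.right, r.top), (r.right, r.bottom)]
    for r in reversed(ordered):
        points += [(r.left, r.bottom), (r.left, r.top)]
    # Drop consecutive duplicate vertices produced by abutting lines.
    return [p for i, p in enumerate(points) if i == 0 or p != points[i - 1]]


def describe_boundary(polygon: List[Tuple[float, float]]) -> str:
    """Serialize the boundary; the JSON key is an illustrative choice."""
    return json.dumps({"text_layout_boundary": polygon})


# Three text lines; another block (e.g. an image) sits right of the second line.
text_lines = [Rect(10, 0, 90, 10), Rect(10, 10, 60, 20), Rect(10, 20, 95, 30)]
image_block = Rect(70, 10, 100, 20)
extended = extend_edges(text_lines, 0, 100, [image_block])
polygon = layout_boundary(extended)
print(describe_boundary(polygon))
```

In this example the second line keeps its original right edge because widening it would run into the neighboring block, so the resulting boundary is stepped around that block rather than a plain rectangle.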