Abstract |
<p>PURPOSE: To structure the document areas of vertically written and horizontally written documents with high accuracy and to extract the correct reading order. CONSTITUTION: An area extraction means 102 extracts character areas, drawing areas, and other areas from a binary image. A sentence area classification means 103 classifies sentence areas into drawing titles, titles, headers, footers, and other text areas. A ruled line information generation means 104 generates imaginary ruled lines for the extracted ruled line areas and white areas, and imaginary ruled lines at the edges of drawing areas and the like. A sentence area arrangement structuring means 105 structures the arrangement of the text areas and expresses it as a tree graph, and a reading order extraction means 106 determines the reading order from that graph representation.</p> |
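
The final step, deriving a reading order from a tree graph of text areas, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the patent: the `Area` class, the `reading_order` function, the coordinates, and the sibling ordering rule (top-to-bottom then left-to-right for horizontal writing, right-to-left then top-to-bottom for vertical writing). The abstract does not state how the means 106 actually traverses the graph; a pre-order depth-first traversal is simply one plausible choice.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Area:
    """Hypothetical text-area node: a name plus the top-left corner of its box."""
    name: str
    x: int  # left edge, in pixels
    y: int  # top edge, in pixels
    children: List["Area"] = field(default_factory=list)


def reading_order(node: Area, vertical: bool = False) -> List[str]:
    """Pre-order depth-first traversal of the area tree.

    Siblings are sorted by an ASSUMED layout rule:
      horizontal writing: top-to-bottom, then left-to-right
      vertical writing:   right-to-left, then top-to-bottom
    """
    if vertical:
        key = lambda a: (-a.x, a.y)
    else:
        key = lambda a: (a.y, a.x)

    order = [node.name]
    for child in sorted(node.children, key=key):
        order.extend(reading_order(child, vertical))
    return order


# Illustrative page: a title above a body that holds two columns.
page = Area("page", 0, 0, [
    Area("body", 0, 100, [Area("col2", 300, 100), Area("col1", 0, 100)]),
    Area("title", 0, 0),
])

horizontal = reading_order(page)
vertical = reading_order(page, vertical=True)
print(horizontal)
print(vertical)
```

With the same tree, only the sibling sort key changes between the two writing directions, which is one reason a tree representation is convenient for this kind of problem.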