Abstract |
PURPOSE: To accurately analyze the layout of lines and paragraphs in a document image, even when a line is broken partway into multiple lines or when multiple lines appear within parentheses. CONSTITUTION: The device comprises a character candidate element generating part 104, which generates character candidate elements from connected components of black pixels in the document image; a lateral direction line rectangle generating part 105, which takes a plurality of character candidate elements arranged in the line direction and groups those whose displacement perpendicular to the line direction is below a threshold into a line candidate element; and a lateral paragraph generating part 106, which groups a plurality of line candidate elements of approximately the same length, arranged in the vertical direction, into a paragraph candidate element.
|
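
The three stages described for parts 104, 105 and 106 can be illustrated with a minimal Python sketch. This is not the patented implementation: the function names (`char_candidates`, `group_lines`, `group_paragraphs`), the use of 4-connectivity, the greedy left-to-right grouping, and the parameters `max_dy` and `tol` are all assumptions made for illustration. The abstract specifies only that a vertical-displacement threshold and an "approximately same length" criterion are used.

```python
from collections import deque

# A box is (x0, y0, x1, y1) with inclusive pixel coordinates.

def union(boxes):
    """Smallest box enclosing all given boxes."""
    return (min(b[0] for b in boxes), min(b[1] for b in boxes),
            max(b[2] for b in boxes), max(b[3] for b in boxes))

def char_candidates(img):
    """Stage 104 (sketch): bounding boxes of 4-connected black-pixel (1) components."""
    h = len(img)
    w = len(img[0]) if h else 0
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if img[y][x] != 1 or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(x, y)])
            x0 = x1 = x
            y0 = y1 = y
            while queue:  # breadth-first flood fill over one component
                cx, cy = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if 0 <= nx < w and 0 <= ny < h and img[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes

def group_lines(chars, max_dy):
    """Stage 105 (sketch): join characters, left to right, whose vertical
    centre differs from the previous character of a line by less than max_dy."""
    lines = []  # each entry is a list of boxes in left-to-right order
    for box in sorted(chars):
        cy = (box[1] + box[3]) / 2
        for line in lines:
            last = line[-1]
            if abs(cy - (last[1] + last[3]) / 2) < max_dy:
                line.append(box)
                break
        else:
            lines.append([box])  # no existing line is close enough
    return sorted((union(line) for line in lines), key=lambda b: b[1])

def group_paragraphs(lines, tol):
    """Stage 106 (sketch): stack vertically adjacent lines whose widths
    differ by at most the fraction tol of the wider line."""
    paras = []
    prev_w = None
    for line in sorted(lines, key=lambda b: b[1]):
        width = line[2] - line[0] + 1
        if paras and abs(width - prev_w) <= tol * max(width, prev_w):
            paras[-1] = union([paras[-1], line])
        else:
            paras.append(line)
        prev_w = width
    return paras

# Toy image: two full-width rows of three "characters" and one short row.
img = [
    [1, 0, 1, 0, 1],
    [0, 0, 0, 0, 0],
    [1, 0, 1, 0, 1],
    [0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
]
lines = group_lines(char_candidates(img), max_dy=1)
paragraphs = group_paragraphs(lines, tol=0.2)
```

In this toy input the two full-width rows form one paragraph candidate and the short row forms another. Because the sketch uses only the length criterion, a short final line of a paragraph is split off; handling that case, along with the broken lines and parenthesized multi-line text named in the PURPOSE, would need logic that the abstract does not detail.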