Abstract |
PROBLEM TO BE SOLVED: To extract a title and caption from a scanned image. SOLUTION: A grayscale image is thresholded at a plurality of levels to obtain a plurality of binary images, and all connected components in each binary image are then identified and clustered. Possible title regions are identified in each binary image, and the possible title regions from all binary images are then merged. Non-title regions are eliminated from the previously identified possible title regions by comparing the properties of each region against predetermined criteria, and the text is then extracted from the remaining title regions. |
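
The steps named in the SOLUTION can be sketched roughly as follows. This is only an illustrative sketch, not the method as claimed: the abstract does not give threshold values, a clustering rule, or the title criteria, so the thresholds `(64, 128, 192)`, the gap-based clustering, the intersection-based merging, and the `min_height` / `min_aspect` tests are all assumptions, as are all function names. The image is assumed to be a list of rows of 0-255 gray values with dark text on a light background. The final text extraction (for example, OCR) is not shown.

```python
from collections import deque

def binarize(gray, threshold):
    """Binary image: 1 where the pixel is darker than the threshold."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]

def connected_components(binary):
    """Bounding boxes (top, left, bottom, right) of 4-connected foreground regions."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] == 0 or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            top, left, bottom, right = y, x, y, x
            while queue:  # breadth-first flood fill of one component
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes

def cluster_into_lines(boxes, max_gap):
    """Assumed clustering rule: join boxes that overlap vertically and are
    at most max_gap pixels apart horizontally."""
    lines = []
    for box in sorted(boxes, key=lambda b: b[1]):  # left to right
        for i, line in enumerate(lines):
            overlaps = box[0] <= line[2] and line[0] <= box[2]
            if overlaps and box[1] - line[3] <= max_gap:
                lines[i] = (min(line[0], box[0]), line[1],
                            max(line[2], box[2]), max(line[3], box[3]))
                break
        else:
            lines.append(box)
    return lines

def merge_regions(regions):
    """Assumed merging rule: unite candidate boxes from all levels that intersect."""
    merged = []
    for box in regions:
        changed = True
        while changed:
            changed = False
            for other in merged:
                if (box[0] <= other[2] and other[0] <= box[2]
                        and box[1] <= other[3] and other[1] <= box[3]):
                    merged.remove(other)
                    box = (min(box[0], other[0]), min(box[1], other[1]),
                           max(box[2], other[2]), max(box[3], other[3]))
                    changed = True
                    break
        merged.append(box)
    return merged

def is_title(box, min_height, min_aspect):
    """Hypothetical criteria: tall enough and clearly wider than tall."""
    height = box[2] - box[0] + 1
    width = box[3] - box[1] + 1
    return height >= min_height and width / height >= min_aspect

def extract_title_regions(gray, thresholds=(64, 128, 192), max_gap=2,
                          min_height=3, min_aspect=2.0):
    candidates = []
    for t in thresholds:  # one binary image per threshold level
        boxes = connected_components(binarize(gray, t))
        candidates.extend(cluster_into_lines(boxes, max_gap))
    merged = merge_regions(candidates)  # integrate candidates across levels
    return [b for b in merged if is_title(b, min_height, min_aspect)]
```

Calling `extract_title_regions(gray)` returns the bounding boxes of the regions that survive the filter. In practice the criteria would presumably be tuned to the scan resolution and to the expected title font size.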