Abstract |
PROBLEM TO BE SOLVED: To provide an image processing apparatus that generates a character-recognized PDF file with high image quality and a high compression ratio. SOLUTION: From the characters extracted by an extraction means for extracting characters from an image, the apparatus retrieves partial area characters, that is, characters contained in partial areas of the image that are expressed only in a single color and are lighter in color than the corresponding partial area. For the characters not retrieved, first binary image information is generated, consisting only of characters of the same color as those characters. For each partial area containing retrieved partial area characters, when the partial area image that remains after the partial area characters are removed can be expressed only in a single color, second binary image information is generated, consisting of that partial area image and the characters of the same color as the partial area image. Using the generated first and second binary image information, multivalued image information is generated by erasing the areas extracted as binary images, filling them with the peripheral colors of those areas, and character information indicating only the characters extracted by the extraction means is generated. COPYRIGHT: (C)2009, JPO&INPIT
|
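Two of the steps in the abstract can be illustrated with a minimal Python sketch: grouping characters into one binary mask per color (the first binary image information) and erasing the character pixels from the multivalued image using the peripheral color. This is not the patented implementation. The data layout (an image as a list of rows of color tuples, each character as a dict with hypothetical `color` and `pixels` keys) and both function names are assumptions made for illustration, and the handling of light characters on single-color partial areas (the second binary image information) is omitted.

```python
from collections import Counter, defaultdict


def build_color_masks(height, width, chars):
    """Group character pixels into one binary mask per character color."""
    masks = defaultdict(lambda: [[0] * width for _ in range(height)])
    for ch in chars:
        for y, x in ch["pixels"]:
            masks[ch["color"]][y][x] = 1
    return dict(masks)


def erase_with_peripheral_color(image, chars):
    """Fill each character's pixels with the most common neighboring color.

    Only 4-connected neighbors that belong to no character are counted,
    which is an assumed definition of "peripheral color".
    """
    height, width = len(image), len(image[0])
    all_char_pixels = {p for ch in chars for p in ch["pixels"]}
    out = [row[:] for row in image]
    for ch in chars:
        neighbors = Counter()
        for y, x in ch["pixels"]:
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                inside = 0 <= ny < height and 0 <= nx < width
                if inside and (ny, nx) not in all_char_pixels:
                    neighbors[image[ny][nx]] += 1
        if neighbors:  # leave the character untouched if it has no background neighbor
            fill = neighbors.most_common(1)[0][0]
            for y, x in ch["pixels"]:
                out[y][x] = fill
    return out
```

In a layered PDF of this kind, the per-color masks would typically be stored with a lossless binary codec and the erased background with a lossy multivalued codec, which is where the combination of sharp text and a high compression ratio comes from. The choice of codecs is not stated in the abstract.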