Abstract |
When OCR is applied to a business document stored in grayscale, a seal impression is removed while the character string information is preserved, even if the character string and the seal impression overlap. The overlapped character string is extrapolated by matching a character string present near the seal impression against a database. First, the seal impression region is removed from the document input in grayscale. Next, character information that lies near the removed seal impression region, and in which some characters are unclear because of that region, is extracted as seal impression related information. Then, an attribute of the extracted seal impression related information is identified, a customer database storing character string candidates containing customer information is referenced, and, based on the seal impression related information classified by attribute, the unclear character string that overlaps the seal impression region is extrapolated.
|
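The final extrapolation step can be illustrated with a minimal sketch. It assumes the seal impression has already been removed, the legible fragment has been read by OCR, and its attribute has been identified. The database contents, the attribute names, the function name `extrapolate`, the similarity measure (Python's `difflib.SequenceMatcher`), and the threshold are all hypothetical choices made for illustration; the abstract does not specify how the matching is performed.

```python
from difflib import SequenceMatcher

# Hypothetical customer database: attribute -> candidate character strings.
CUSTOMER_DB = {
    "company": [
        "Acme Trading Co., Ltd.",
        "Acorn Tools Inc.",
        "Summit Logistics Inc.",
    ],
    "address": [
        "1-2-3 Marunouchi, Tokyo",
        "4-5-6 Umeda, Osaka",
    ],
}


def extrapolate(fragment, attribute, db=CUSTOMER_DB, threshold=0.6):
    """Return the candidate most similar to the legible fragment, or None.

    Only candidates stored under the given attribute are compared.
    The similarity measure and threshold are assumptions of this sketch.
    """
    best, best_score = None, 0.0
    for candidate in db.get(attribute, []):
        score = SequenceMatcher(None, fragment.lower(), candidate.lower()).ratio()
        if score > best_score:
            best, best_score = candidate, score
    # Reject weak matches rather than guessing a customer string.
    return best if best_score >= threshold else None


if __name__ == "__main__":
    # "Acme Trading C" stands for a company name whose tail was hidden by the seal.
    print(extrapolate("Acme Trading C", "company"))
```

Restricting the comparison to candidates of the identified attribute keeps a partially legible company name from being matched against, say, an address, which is the role the attribute classification plays in the abstract.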