Abstract |
PROBLEM TO BE SOLVED: To provide a character recognition apparatus, method, and program that improve the recognition rate of handwritten characters in documents containing both printed and handwritten characters. SOLUTION: An image input part 11 creates an input image of a document containing both printed and handwritten characters, and a binarization part 12 binarizes the input image. A document structure analysis part 14 divides the binarized image into two or more text regions based on paragraphs and columns and groups them into blocks. A character cut-out part 15 segments each text region into individual characters. A feature value calculation part 16 calculates a feature value for each text region, using the results of the document structure analysis and the character cut-out. A feature value accumulation part accumulates the feature values and derives a separation factor for distinguishing printed characters from handwritten characters. A printed and handwritten character separation part 18 uses the separation factor to separate printed characters from handwritten characters in the input image and in the output image of the document structure analysis part 14, and obtains an extracted image for each. COPYRIGHT: (C)2006,JPO&NCIPI
|