Abstract |
A document image processing device includes: a region dividing unit that divides a document image into sentence regions; a character recognizing unit that recognizes the characters in each sentence region obtained by the region dividing unit; a classifying unit that classifies the sentence regions into groups based on their first character sizes and first line spacings; a translation unit that translates the character string in each sentence region; a calculating unit that calculates second character sizes and second line spacings; and a correcting unit that corrects the second character sizes and second line spacings of the sentence regions classified into the same group by the classifying unit, so that the differences in second character size and second line spacing between the sentence regions of the same group are substantially equal to or less than predetermined values.
|
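The classifying and correcting steps can be illustrated with a minimal Python sketch. It rests on several assumptions that the abstract does not state: the "first" values are taken to be those measured in the source image and the "second" values those calculated for the translated text; regions are grouped by bucketing their first values within a tolerance; and correction clamps each second value down to the group minimum plus the permitted difference. The names `Region`, `classify`, and `correct`, and all parameters, are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Region:
    """One sentence region (hypothetical structure, for illustration only)."""
    name: str
    first_size: float      # assumed: character size measured in the source image
    first_spacing: float   # assumed: line spacing measured in the source image
    second_size: float     # assumed: character size calculated for the translation
    second_spacing: float  # assumed: line spacing calculated for the translation


def classify(regions, size_tol=0.5, spacing_tol=0.5):
    """Group regions whose first size and first spacing fall in the same bucket.

    Bucketing by tolerance is an assumption; the abstract only says the
    grouping is based on first character sizes and first line spacings.
    """
    groups = defaultdict(list)
    for region in regions:
        key = (round(region.first_size / size_tol),
               round(region.first_spacing / spacing_tol))
        groups[key].append(region)
    return list(groups.values())


def correct(group, max_size_diff=0.0, max_spacing_diff=0.0):
    """Clamp second values so that in-group differences stay within the limits.

    Clamping toward the group minimum is an assumed strategy, chosen so that
    no region receives larger text than was calculated for it.
    """
    min_size = min(r.second_size for r in group)
    min_spacing = min(r.second_spacing for r in group)
    for r in group:
        r.second_size = min(r.second_size, min_size + max_size_diff)
        r.second_spacing = min(r.second_spacing, min_spacing + max_spacing_diff)


if __name__ == "__main__":
    regions = [
        Region("body-1", 12.0, 18.0, 10.0, 15.0),
        Region("body-2", 12.0, 18.0, 8.0, 12.0),
        Region("caption", 8.0, 12.0, 7.0, 10.0),
    ]
    for group in classify(regions):
        correct(group)
    for r in regions:
        print(r.name, r.second_size, r.second_spacing)
```

With the default limits of zero, every region in a group ends up with the same second character size and line spacing; non-zero limits let regions differ by at most that amount.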