Abstract |
PROBLEM TO BE SOLVED: To provide a document management system that improves the recognition rate (understanding of the written contents) by performing OCR processing with a different language for each zone when zones in several languages are mixed in the same image. SOLUTION: The document management system comprises a zone discrimination module 2 that discriminates the character, chart, picture, and photograph zones of image information read into memory; a line discrimination module 3 that divides a character zone into line zones; a language discrimination module 4 that discriminates the language of each character zone and line zone; an OCR module 1 that performs OCR processing with language dictionaries corresponding to the various languages; and a full-text retrieval engine 5 with a full-text retrieval function that supports those languages. The system discriminates the language of every character zone and performs OCR recognition in a plurality of languages within the same image. COPYRIGHT: (C)2006, JPO&NCIPI
|
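The flow described in the abstract (zone discrimination, line division, language discrimination, then OCR with a matching dictionary) might be sketched as follows. This is only an illustration under stated assumptions, not the patented implementation: `Zone`, `discriminate_language`, and `recognize_page` are hypothetical names, each line zone is represented by sample text rather than image data, the language is guessed from Unicode character names (the abstract does not say how module 4 works), and the OCR engines are placeholder callables keyed by language code.

```python
import unicodedata
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Zone:
    """Hypothetical output of zone discrimination (module 2)."""
    kind: str         # "character", "chart", "picture", or "photograph"
    lines: List[str]  # line zones (module 3); text stands in for image data


def discriminate_language(sample: str) -> str:
    """Assumed stand-in for module 4: guess "ja" or "en" by script counts."""
    ja = en = 0
    for ch in sample:
        name = unicodedata.name(ch, "")
        if name.startswith(("HIRAGANA", "KATAKANA", "CJK UNIFIED")):
            ja += 1
        elif name.startswith("LATIN"):
            en += 1
    return "ja" if ja > en else "en"


def recognize_page(
    zones: List[Zone],
    ocr_engines: Dict[str, Callable[[str], str]],
) -> List[Tuple[str, str]]:
    """Run the per-language OCR engine (module 1) on each line zone."""
    results = []
    for zone in zones:
        if zone.kind != "character":
            continue  # charts, pictures, and photographs are not sent to OCR
        for line in zone.lines:
            lang = discriminate_language(line)
            results.append((lang, ocr_engines[lang](line)))
    return results
```

In this sketch each line zone gets its own language decision, so one image can be recognized in several languages. The `(language, text)` pairs it returns could then be handed to a full-text retrieval index that supports those languages.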