Abstract |
The system has a scanner to capture the document content, a computer to process the digital data, and hard disc storage for the collected data. The computer is connected to the scanner and includes a circuit that extracts (10, 11) characteristics of the document and directs the signals to a first alphanumeric recognition system. The recognition system has a dictionary of character data held in memory. A second part of the system assembles the characters into words and is connected to a module that selects from indexing tables that generate a series of structured keywords. Another module compresses images taken from the document. When the user enters search words, they are compared with the keywords that have been extracted from the document.
|
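The word-assembly, keyword-selection and search-comparison stages described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say how the indexing tables are organised or how search words are compared, so the table contents, the `TYPE:VALUE` keyword format and the function names `assemble_words`, `extract_keywords` and `matches` are all assumptions made for illustration. Scanning, character recognition and image compression are left out.

```python
# Hypothetical indexing table: maps a recognized word to a structured keyword.
# The entries and the "TYPE:VALUE" format are assumptions, not from the abstract.
INDEX_TABLE = {
    "invoice": "DOC_TYPE:INVOICE",
    "acme": "COMPANY:ACME",
    "1998": "YEAR:1998",
}


def assemble_words(chars):
    """Join recognized characters into lowercase words, splitting on whitespace."""
    return "".join(chars).lower().split()


def extract_keywords(words, index_table=INDEX_TABLE):
    """Keep the structured keyword of each word found in the indexing table.

    Order of first appearance is preserved and duplicates are dropped.
    """
    keywords = []
    for word in words:
        keyword = index_table.get(word)
        if keyword is not None and keyword not in keywords:
            keywords.append(keyword)
    return keywords


def matches(search_words, keywords, index_table=INDEX_TABLE):
    """Return True if every search word maps to a keyword stored for the document."""
    wanted = [index_table.get(word.lower()) for word in search_words]
    return all(k is not None and k in keywords for k in wanted)


# Characters as they might arrive from the recognition stage (assumed input).
recognized = list("Invoice ACME 1998 total")
document_keywords = extract_keywords(assemble_words(recognized))
```

In this sketch `matches(["acme", "1998"], document_keywords)` is true, while `matches(["total"], document_keywords)` is false because "total" has no entry in the assumed indexing table. Storing only the structured keywords, rather than the full recognized text, keeps the comparison at search time to a simple lookup.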