Abstract |
A computer-implemented method of indexing a multi-page electronic document. The method involves scanning a physical document to produce the electronic document, and performing text extraction, for example optical character recognition, on the electronic document to produce computer-readable text. Each page of the electronic document is classified by category type by applying classification rules to the computer-readable text. Metadata is associated with each page to indicate its assigned category types. A computer-usable index is created from the metadata, the index organising the pages according to category type. |
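
The classification and indexing steps described above can be illustrated with a minimal sketch. It assumes text extraction has already produced one string per page, and it takes the classification rules to be simple regular-expression patterns per category type. The rule set, the function names `classify_page` and `build_index`, and the `uncategorised` fallback are all hypothetical and are not taken from the abstract.

```python
import re
from collections import defaultdict

# Hypothetical classification rules: category type -> regex patterns.
# The abstract does not specify the form the rules take.
RULES = {
    "invoice": [r"\binvoice\b", r"\bamount due\b"],
    "contract": [r"\bagreement\b", r"\bhereinafter\b"],
}


def classify_page(text, rules=RULES):
    """Return the sorted category types whose patterns match the page text."""
    return sorted(
        category
        for category, patterns in rules.items()
        if any(re.search(p, text, re.IGNORECASE) for p in patterns)
    )


def build_index(page_texts, rules=RULES):
    """Attach category metadata to each page and group page numbers by category."""
    metadata = []
    index = defaultdict(list)
    for page_number, text in enumerate(page_texts, start=1):
        # Assumption: pages matching no rule go under a fallback category.
        categories = classify_page(text, rules) or ["uncategorised"]
        metadata.append({"page": page_number, "categories": categories})
        for category in categories:
            index[category].append(page_number)
    return metadata, dict(index)


# Example: three pages of already-extracted text.
pages = [
    "Invoice No. 12. Amount due: 100",
    "This Agreement is made between the parties",
    "Blank page",
]
metadata, index = build_index(pages)
```

Here `metadata` holds the per-page category assignments and `index` maps each category type to the page numbers it contains, so a page matching several rules would appear under each of its categories.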