Abstract |
A computer-implemented method for extracting information from a population of subject documents. The method includes modeling a document structure. The modeled document structure includes at least a document component hierarchy with at least one record type. Each record type includes at least one record part type, and at least one record part type comprises at least one data element type. For a subject document exhibiting at least a portion of the modeled document structure, preferred embodiments of the invention identify data of a type corresponding to at least one modeled data element type. The identified subject document data is then associated with the corresponding modeled data element type.
|
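The hierarchy described in the abstract (record type, record part type, data element type) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the patent: the class names, the use of regular expressions to recognize data of a given element type, and the invoice example are all hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class DataElementType:
    name: str
    pattern: str  # hypothetical: a regex that recognizes data of this type


@dataclass
class RecordPartType:
    name: str
    element_types: List[DataElementType] = field(default_factory=list)


@dataclass
class RecordType:
    name: str
    part_types: List[RecordPartType] = field(default_factory=list)


def extract(record_type: RecordType, text: str) -> List[Tuple[str, str, str]]:
    """Associate matched text with the modeled data element type.

    Returns (record part name, data element name, matched text) triples.
    """
    found = []
    for part in record_type.part_types:
        for element in part.element_types:
            for match in re.finditer(element.pattern, text):
                found.append((part.name, element.name, match.group(0)))
    return found


# Hypothetical model: one record type with one part and two element types.
invoice = RecordType("invoice", [
    RecordPartType("header", [
        DataElementType("invoice_number", r"INV-\d+"),
        DataElementType("date", r"\d{4}-\d{2}-\d{2}"),
    ]),
])

results = extract(invoice, "Invoice INV-1042 issued 2021-03-15")
```

In this sketch, each extracted value stays linked to the names of its record part type and data element type, which mirrors the association step the abstract describes. A real implementation would likely use richer recognition than regular expressions.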