Abstract |
Techniques are disclosed for generating rules for classifying structured documents, and for classifying, retrieving, or checking structured documents using the generated rules. In one example, a method for generating rules for classifying a plurality of electronic structured documents to which the same schema is applied comprises a computer performing the following steps: scanning the schema to determine one or more variable portions defined by the schema; acquiring, from each of the plurality of structured documents, respective feature values of the determined variable portions, and associating each structured document with the feature values acquired from it; and generating the rules on the basis of the feature values associated with the structured documents. |
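
The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the schema is assumed to be W3C XML Schema (XSD), a "variable portion" is assumed to be any element declared optional (`minOccurs="0"`) or repeatable (`maxOccurs` other than 1), the feature value is assumed to be the number of times that element occurs in a document, and a "rule" is assumed to be one distinct combination of feature values. The function names, the sample schema, and the sample documents are all hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

XS = "{http://www.w3.org/2001/XMLSchema}"  # XSD namespace in ElementTree notation


def find_variable_portions(schema_text):
    """Step 1: scan the schema for elements whose occurrence count may vary.

    Assumption: "variable" means optional or repeatable.
    """
    names = []
    for el in ET.fromstring(schema_text).iter(XS + "element"):
        name = el.get("name")
        if name is None:
            continue  # skip element references without a name
        optional = el.get("minOccurs", "1") == "0"
        repeatable = el.get("maxOccurs", "1") != "1"
        if optional or repeatable:
            names.append(name)
    return names


def feature_values(doc_text, variable_names):
    """Step 2: acquire one feature value per variable portion.

    Assumption: the feature value is the element's occurrence count.
    """
    root = ET.fromstring(doc_text)
    return tuple(sum(1 for _ in root.iter(n)) for n in variable_names)


def generate_rules(schema_text, docs):
    """Step 3: generate rules from the feature values tied to each document.

    Assumption: each distinct feature vector becomes one rule.
    """
    names = find_variable_portions(schema_text)
    groups = defaultdict(list)
    for doc_id, text in docs.items():
        groups[feature_values(text, names)].append(doc_id)
    rules = [
        {"condition": dict(zip(names, vector)), "documents": doc_ids}
        for vector, doc_ids in groups.items()
    ]
    return names, rules


# Hypothetical sample data, for illustration only.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="id" type="xs:string"/>
        <xs:element name="note" type="xs:string" minOccurs="0"/>
        <xs:element name="item" type="xs:string" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

DOCS = {
    "a.xml": "<order><id>1</id><item>x</item></order>",
    "b.xml": "<order><id>2</id><note>n</note><item>x</item><item>y</item></order>",
    "c.xml": "<order><id>3</id><item>z</item></order>",
}
```

Calling `generate_rules(SCHEMA, DOCS)` groups `a.xml` and `c.xml` under one rule (no `note`, one `item`) and places `b.xml` under another. Grouping by exact feature vector is the simplest possible rule generator; a real system might instead derive thresholds or ranges from the feature values.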