Abstract |
PROBLEM TO BE SOLVED: To provide a method, a computer, and a computer program for producing a rule for efficiently classifying structured documents such as XML documents. SOLUTION: A method is provided for producing a rule for classifying a plurality of digitized structured documents to which the same schema is applied. The method includes the steps of: scanning the schema to identify one or more variable portions defined by the schema; acquiring feature values of the identified variable portions from each of the structured documents and associating each acquired feature value with the structured document from which it was acquired; and producing the rule on the basis of the feature values associated with the structured documents. A computer for producing a rule for classifying a plurality of digitized structured documents to which the same schema is applied, and a computer program therefor, are also provided. COPYRIGHT: (C)2012,JPO&INPIT |
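
The three steps named in the abstract can be illustrated with a minimal Python sketch. The abstract does not say what counts as a "variable portion", what the "feature values" are, or what form the rule takes, so everything concrete here is an assumption made for illustration: a variable portion is taken to be an XSD element declaration whose `minOccurs` and `maxOccurs` differ, the feature value is the number of times that element occurs in a document, and the rule is a simple mapping from each distinct feature vector to a class label. The schema, the sample documents, and the function names `find_variable_portions`, `extract_features`, and `produce_rule` are hypothetical.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema shared by all documents (assumption for illustration).
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="id" type="xs:string"/>
        <xs:element name="item" type="xs:string" maxOccurs="unbounded"/>
        <xs:element name="note" type="xs:string" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

# Hypothetical documents that conform to the schema above.
DOCS = {
    "a.xml": "<order><id>1</id><item>pen</item></order>",
    "b.xml": "<order><id>2</id><item>pen</item><item>ink</item>"
             "<note>rush</note></order>",
    "c.xml": "<order><id>3</id><item>cap</item></order>",
}


def find_variable_portions(schema_text):
    """Step 1: scan the schema for elements whose occurrence count can vary.

    Assumption: "variable" means minOccurs != maxOccurs (both default to 1).
    """
    root = ET.fromstring(schema_text)
    names = []
    for decl in root.iter(XS + "element"):
        lo = decl.get("minOccurs", "1")
        hi = decl.get("maxOccurs", "1")
        if decl.get("name") and lo != hi:
            names.append(decl.get("name"))
    return names


def extract_features(doc_text, portions):
    """Step 2: one feature per variable portion (assumed: occurrence count)."""
    root = ET.fromstring(doc_text)
    return tuple(sum(1 for _ in root.iter(name)) for name in portions)


def produce_rule(features_by_doc):
    """Step 3: assumed rule form, one class label per distinct feature vector."""
    rule = {}
    for doc_id in sorted(features_by_doc):
        vector = features_by_doc[doc_id]
        if vector not in rule:
            rule[vector] = f"class_{len(rule)}"
    return rule


portions = find_variable_portions(SCHEMA)
# Associate each feature vector with the document it was acquired from.
features_by_doc = {
    doc_id: extract_features(text, portions) for doc_id, text in DOCS.items()
}
rule = produce_rule(features_by_doc)
classes = {doc_id: rule[vec] for doc_id, vec in features_by_doc.items()}

print(portions)  # ['item', 'note']
print(classes)   # {'a.xml': 'class_0', 'b.xml': 'class_1', 'c.xml': 'class_0'}
```

In this sketch the fixed `id` element is ignored because the schema requires exactly one occurrence in every document, so it cannot distinguish one document from another. Restricting feature extraction to the variable portions is what the abstract presents as the source of efficiency; the grouping by identical feature vector stands in for whatever rule-producing procedure the full specification describes.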