Title GENERATING RULES FOR CLASSIFYING STRUCTURED DOCUMENTS
Abstract Techniques are disclosed for generating rules for classifying structured documents, and for classifying, retrieving, or checking structured documents, using generated rules. In one example, a method for generating rules for classifying a plurality of electronic structured documents to which a same schema is applied comprises a computer performing the following steps: determining one or more variable portions defined by the schema by scanning the schema; acquiring respective feature values of the determined variable portions from each of the plurality of structured documents and associating the structured document, from which the feature values are acquired, with the acquired feature values; and generating the rules on the basis of the feature values associated with the structured document.
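The three steps named in the abstract (scan the schema for variable portions, acquire feature values from each document, generate rules from those values) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent: the schema is an XML Schema, a "variable portion" is taken to be any element declared with a `type` attribute, a "feature value" is that element's text, and a "rule" is simply one distinct combination of feature values together with the documents that share it. The function names and sample data are hypothetical.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

def variable_portions(schema_xml):
    """Step 1 (assumed meaning): names of elements declared with a `type`."""
    root = ET.fromstring(schema_xml)
    return [el.get("name") for el in root.iter(XS + "element")
            if el.get("name") and el.get("type")]

def feature_values(doc_xml, portions):
    """Step 2: text of the first descendant element found for each portion."""
    root = ET.fromstring(doc_xml)
    values = {}
    for name in portions:
        node = root.find(".//" + name)
        has_text = node is not None and node.text
        values[name] = node.text.strip() if has_text else None
    return values

def generate_rules(schema_xml, documents):
    """Step 3 (assumed strategy): one rule per distinct value combination."""
    portions = variable_portions(schema_xml)
    groups = {}
    for doc_id, doc_xml in documents.items():
        values = feature_values(doc_xml, portions)
        key = tuple(values[name] for name in portions)
        groups.setdefault(key, []).append(doc_id)
    return [{"conditions": dict(zip(portions, key)), "documents": ids}
            for key, ids in groups.items()]

# Hypothetical sample schema and documents.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="region" type="xs:string"/>
        <xs:element name="priority" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

DOCS = {
    "a.xml": "<order><region>EU</region><priority>high</priority></order>",
    "b.xml": "<order><region>EU</region><priority>high</priority></order>",
    "c.xml": "<order><region>US</region><priority>low</priority></order>",
}

RULES = generate_rules(SCHEMA, DOCS)
for rule in RULES:
    print(rule["conditions"], rule["documents"])
```

In this sketch, documents that agree on every variable portion fall under the same rule. The rule-generation strategy actually claimed may differ, since the abstract does not specify it.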
Publication No. US2012109960(A1) Publication date 2012.05.03
Application No. US201113274988 Filing date 2011.10.17
Applicant MISHINA TAKUYA;TAKASE TOSHIRO;INTERNATIONAL BUSINESS MACHINES CORPORATION Inventor MISHINA TAKUYA;TAKASE TOSHIRO
Classification G06F17/30 Main classification G06F17/30
Agency Agent
Main claim
Address