Main claim |
1. A method for classification, the method comprising:
identifying, by a computer system, in one or more import records, a plurality of attribute-value pairs each having: an attribute label not found in a native schema; and a value;
identifying, by the computer system, one or more attribute labels in the native schema having as possible values one or more values corresponding to values of the plurality of attribute-value pairs;
generating, by the computer system, one or more normalization rules relating the one or more attribute labels of the plurality of attribute-value pairs to at least a portion of the one or more attribute labels in the native schema;
normalizing a plurality of non-normalized records according to the one or more normalization rules to generate a plurality of provisionally normalized records;
transmitting the plurality of provisionally normalized records to a crowdsourcing forum;
receiving, from the crowdsourcing forum, one or more favorable validation decisions with respect to a first portion of the plurality of provisionally normalized records;
receiving, from the crowdsourcing forum, one or more unfavorable validation decisions with respect to a second portion of the plurality of provisionally normalized records;
identifying one or more first normalization rules from the one or more normalization rules, the one or more first normalization rules corresponding to the first portion of the plurality of provisionally normalized records; and
adding the one or more first normalization rules to a validated rule set. |
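
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the claimed implementation: the `Rule` class, the function names, the sample schema and records, and the list of booleans standing in for the crowdsourcing forum's decisions are all hypothetical. The sketch also assumes that a rule is a simple label rename, that value correspondence means exact membership in a native attribute's set of possible values, and that a rule "corresponds to" a provisionally normalized record when it was the rule applied to produce that record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical normalization rule: rename source_label to native_label."""
    source_label: str
    native_label: str


def generate_rules(import_records, native_schema):
    """Propose rules for attribute labels that are absent from the native schema.

    native_schema maps each native attribute label to its set of possible values.
    """
    rules = set()
    for record in import_records:
        for label, value in record.items():
            if label in native_schema:
                continue  # label is already native; nothing to normalize
            for native_label, possible_values in native_schema.items():
                if value in possible_values:
                    rules.add(Rule(label, native_label))
    return rules


def normalize(records, rules):
    """Apply every applicable rule to every record, keeping the rule alongside."""
    ordered = sorted(rules, key=lambda r: (r.source_label, r.native_label))
    provisional = []
    for record in records:
        for rule in ordered:
            if rule.source_label in record:
                renamed = dict(record)
                renamed[rule.native_label] = renamed.pop(rule.source_label)
                provisional.append((renamed, rule))
    return provisional


def validated_rule_set(provisional, decisions):
    """Collect the rules behind favorably validated records.

    decisions is a list of booleans, one per provisional record, standing in
    for the validation decisions received from the crowdsourcing forum.
    """
    return {rule for (_, rule), ok in zip(provisional, decisions) if ok}


# Hypothetical data: "shade" happens to carry the value "M", so a spurious
# rule mapping "shade" to "size" is generated alongside the sensible one.
native_schema = {"color": {"red", "blue"}, "size": {"S", "M", "L"}}
import_records = [{"colour": "red"}, {"shade": "M"}]

rules = generate_rules(import_records, native_schema)
provisional = normalize([{"colour": "blue"}, {"shade": "dark"}], rules)
decisions = [True, False]  # simulated crowd: first favorable, second unfavorable
validated = validated_rule_set(provisional, decisions)
```

In this sketch the rule mapping `colour` to `color` reaches the validated set, while the rule mapping `shade` to `size` does not, because its only provisional record drew an unfavorable decision. How to treat a rule that receives both favorable and unfavorable decisions is left open here, as the claim does not specify it.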