Abstract |
A data processing method and device for data modeling address a problem in the prior art: preprocessing original data requires a large amount of calculation and a long calculation time, which wastes computing resources and reduces working efficiency. The method comprises: performing, according to a data conversion function corresponding to a preset data processing category identifier, data conversion on the data column corresponding to each feature in the read original data to generate a corresponding extension feature column, and combining the extension feature columns of all features in the original data to generate an extension feature set (202); determining a correlation coefficient of each feature in the extension feature set (203); selecting each feature whose correlation coefficient meets a set condition as an important feature (204); and screening out the data columns corresponding to the important features from the extension feature set (205). The invention thereby avoids the long run time and heavy calculation incurred when data modeling uses an enumeration-based data processing method, improves calculation efficiency, and improves the flexibility and adaptability of automatic data modeling. |
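
Steps (202) to (205) can be illustrated with a minimal Python sketch. The abstract does not name the category identifiers, the conversion functions, the kind of correlation coefficient, or the set condition, so everything concrete here is an assumption made for illustration: the identifiers `identity`, `square`, and `log1p`, the use of the Pearson coefficient against a target column, and an absolute-value threshold as the selection condition. All function and variable names are hypothetical.

```python
import math

# Hypothetical mapping from data processing category identifiers
# to data conversion functions (not specified in the abstract).
CONVERSIONS = {
    "identity": lambda x: x,
    "square": lambda x: x * x,
    "log1p": lambda x: math.log1p(x),
}

def extend_features(data, categories):
    """Step 202: convert each feature column with each selected
    conversion and merge the results into one extension feature set."""
    extended = {}
    for name, column in data.items():
        for cat in categories:
            fn = CONVERSIONS[cat]
            extended[f"{name}__{cat}"] = [fn(v) for v in column]
    return extended

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant column."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def select_important(extended, target, threshold):
    """Steps 203-205: score every extension feature, keep those whose
    absolute coefficient reaches the threshold (assumed condition),
    and screen out their data columns."""
    coeffs = {name: pearson(col, target) for name, col in extended.items()}
    important = [name for name, r in coeffs.items() if abs(r) >= threshold]
    screened = {name: extended[name] for name in important}
    return coeffs, screened

# Toy data: the target is the square of feature "x".
data = {"x": [1, 2, 3, 4, 5], "noise": [3, 1, 4, 1, 5]}
target = [1, 4, 9, 16, 25]

extended = extend_features(data, ["identity", "square", "log1p"])
coeffs, screened = select_important(extended, target, threshold=0.99)
print(sorted(screened))
```

In this toy run only the squared version of `x` clears the threshold, which shows why the extension step runs before selection: the useful column does not exist in the original data. Each candidate column is scored once, so the cost grows with the number of features times the number of conversions.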