Abstract |
A computer-implemented method comprising: analysing a dataset comprising a plurality of data records, wherein each data record comprises data fields, a data field comprising a field name and a field value; receiving a request to provide a predicted field value for a certain target field of a certain target data record, wherein said certain target data record comprises one or more explanatory fields with explanatory values; determining univariate counts indicative of value variation in said target field and in one or more explanatory fields across the dataset; determining bivariate counts indicative of value pair variation in field pairs comprising said target field and at least one of the explanatory fields across the dataset; using the univariate counts and bivariate counts for determining data record signatures for different values of the target field, wherein the signature comprises values of the explanatory fields; repeating said determining of signatures until a certain predefined limit is reached; selecting a signature that at least partially matches values of explanatory fields of the target data record; and concluding that the predicted field value for the target field is the value of the target field corresponding to the selected signature. |
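
The steps in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how counts are turned into signatures, what the predefined limit measures, or how a partial match is scored, so everything below is an assumption made for illustration: the function names (`count_values`, `build_signatures`, `predict`), the `max_signatures` limit, the scoring rule (the share of an explanatory value's occurrences that coincide with a given target value), and the match rule (the number of equal explanatory values). It should not be read as the claimed method itself.

```python
from collections import Counter


def count_values(records, target, explanatory):
    """Return (univariate, bivariate) counts over the dataset."""
    uni = Counter()  # (field, value) -> number of records
    bi = Counter()   # (target value, field, value) -> number of records
    for rec in records:
        if target not in rec:
            continue  # skip records whose target value is unknown
        uni[(target, rec[target])] += 1
        for field in explanatory:
            if field in rec:
                uni[(field, rec[field])] += 1
                bi[(rec[target], field, rec[field])] += 1
    return uni, bi


def build_signatures(uni, bi, target, explanatory, max_signatures):
    """Build one signature per target value (assumed limit: max_signatures)."""
    # Visit target values from most to least frequent.
    target_values = [v for (f, v), _ in uni.most_common() if f == target]
    signatures = {}
    for t in target_values[:max_signatures]:
        signature = {}
        for field in explanatory:
            # Assumed scoring rule: share of this value's occurrences that
            # coincide with target value t, with the raw count as tie-break.
            scored = [
                (n / uni[(fld, val)], n, val)
                for (tv, fld, val), n in bi.items()
                if tv == t and fld == field
            ]
            if scored:
                signature[field] = max(scored, key=lambda s: s[:2])[2]
        signatures[t] = signature
    return signatures


def predict(records, target, query, max_signatures=10):
    """Return the target value whose signature best matches the query."""
    explanatory = [f for f in query if f != target]
    uni, bi = count_values(records, target, explanatory)
    signatures = build_signatures(uni, bi, target, explanatory, max_signatures)
    if not signatures:
        return None

    def matches(t):
        # Assumed match rule: number of explanatory values equal to the query's.
        return sum(query[f] == v for f, v in signatures[t].items())

    return max(signatures, key=matches)


# Hypothetical toy dataset, invented for this example only.
records = [
    {"color": "red", "shape": "round", "fruit": "apple"},
    {"color": "red", "shape": "round", "fruit": "apple"},
    {"color": "green", "shape": "round", "fruit": "apple"},
    {"color": "yellow", "shape": "long", "fruit": "banana"},
    {"color": "yellow", "shape": "long", "fruit": "banana"},
]
prediction = predict(records, "fruit", {"color": "yellow", "shape": "long"})
```

With this toy data, `prediction` is `"banana"`: the signature built for `"banana"` is `{"color": "yellow", "shape": "long"}`, which matches both explanatory values of the query, while the signature for `"apple"` matches neither. A query such as `{"color": "green", "shape": "round"}` matches the `"apple"` signature only on `shape`, which is the kind of partial match the abstract allows.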