Abstract |
A method (100) for carrying out a predictive analysis is provided. It generates a predictive model (Padj(Y | X)) from two separate pieces of information: a set of original training data (Dorig) and a "true" distribution of indicators (Ptrue(X)). The method (100) begins by generating a base model distribution (Pgen(Y | X)) from the original training data set (Dorig), which contains tuples (x, y) of indicators (x) and corresponding labels (y) (step 120). Using the "true" distribution (Ptrue(X)) of indicators, a random data set (D') of indicator records (x) is generated that reflects this "true" distribution (step 140). The base model (Pgen(Y | X)) is then applied to the random data set (D'), assigning a label (y) or a distribution of labels to each indicator record (x) in the random data set (D') and thereby producing an adjusted training set (Dadj) (step 150). Finally, an adjusted predictive model (Padj(Y | X)) is trained on the adjusted training set (Dadj) (step 160). |
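
The four steps above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the abstract does not say which model family is used, so a simple per-indicator label frequency table stands in for both Pgen(Y | X) and Padj(Y | X). The function names (`fit_conditional`, `sample_indicators`, `label_with_model`, `adjusted_model`) and the parameters `n` and `seed` are hypothetical. The sketch also assumes that indicators are discrete and that every indicator value with non-zero probability under Ptrue(X) occurs in Dorig.

```python
import random
from collections import Counter, defaultdict


def fit_conditional(pairs):
    """Estimate P(Y | X) as relative label frequencies per indicator value.

    Assumption: a frequency table stands in for whatever model the method uses.
    """
    counts = defaultdict(Counter)
    for x, y in pairs:
        counts[x][y] += 1
    return {
        x: {y: n / sum(c.values()) for y, n in c.items()}
        for x, c in counts.items()
    }


def sample_indicators(p_true, n, rng):
    """Draw n indicator records x according to the given distribution Ptrue(X)."""
    xs = list(p_true)
    return rng.choices(xs, weights=[p_true[x] for x in xs], k=n)


def label_with_model(model, xs, rng):
    """Assign each record a label sampled from the model's P(Y | X = x)."""
    labelled = []
    for x in xs:
        dist = model[x]  # raises KeyError if x never occurred in Dorig
        ys = list(dist)
        y = rng.choices(ys, weights=[dist[v] for v in ys], k=1)[0]
        labelled.append((x, y))
    return labelled


def adjusted_model(d_orig, p_true, n=10000, seed=0):
    """Run steps 120 to 160 and return (Padj, Dadj)."""
    rng = random.Random(seed)
    p_gen = fit_conditional(d_orig)                # step 120: base model from Dorig
    d_prime = sample_indicators(p_true, n, rng)    # step 140: random set D' ~ Ptrue(X)
    d_adj = label_with_model(p_gen, d_prime, rng)  # step 150: label D' to get Dadj
    p_adj = fit_conditional(d_adj)                 # step 160: train on Dadj
    return p_adj, d_adj


if __name__ == "__main__":
    # Toy data: Dorig is balanced between "a" and "b", Ptrue(X) is not.
    d_orig = [("a", "pos")] * 3 + [("a", "neg")] + [("b", "neg")] * 4
    p_true = {"a": 0.2, "b": 0.8}
    p_adj, d_adj = adjusted_model(d_orig, p_true, n=5000, seed=1)
    print(Counter(x for x, _ in d_adj))
    print(p_adj)
```

In this sketch the indicator records in Dadj follow Ptrue(X) rather than the indicator mix found in Dorig, while their labels come from the base model. With a plain frequency table the adjusted model stays close to the base model for each indicator value; a model with limited capacity would be expected to depend more strongly on the indicator distribution it is trained on.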