Abstract
Methods, systems and computer-readable media for removing labeling-bias factors affecting data from two or more data sources, after single-source biasing factors have been removed to the extent possible. Respective data points from the data sources are considered in combination to generate a population of data points. The population of data points is subdivided into portions of the overall population and, for each portion, the data points are sorted relative to the values of all other data points in that portion. A function is then generated for each portion from the sorted data points for that portion. For each portion, a value representative of the highest population density of data points within that portion is identified. The identified values are fitted to a predetermined curve, and the values of all data points are adjusted relative to the fitted values.
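By way of illustration only, the sequence of steps recited above may be sketched as follows in Python. Several details are assumptions that the abstract does not specify: the two sources are combined as a per-point average level and difference, the portions are equal-count slices ordered by level, the value of highest population density is taken as the midpoint of the shortest window covering half of the sorted values, and the predetermined curve is a least-squares straight line. The names `densest_value`, `fit_line` and `remove_joint_bias` are hypothetical.

```python
from statistics import mean


def densest_value(sorted_vals, frac=0.5):
    """Midpoint of the shortest window covering `frac` of the sorted values.

    Assumed stand-in for "a value representative of highest population density".
    """
    n = len(sorted_vals)
    k = max(2, int(n * frac))  # number of points per window
    if n < k:
        return mean(sorted_vals)
    start = min(range(n - k + 1),
                key=lambda i: sorted_vals[i + k - 1] - sorted_vals[i])
    return (sorted_vals[start] + sorted_vals[start + k - 1]) / 2


def fit_line(xs, ys):
    """Least-squares line; assumed stand-in for the "predetermined curve"."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, my - slope * mx


def remove_joint_bias(a, b, n_portions=4):
    """Return the differences a[i] - b[i] adjusted relative to the fitted values."""
    if not a:
        return []
    # Consider respective data points in combination (assumed: level and difference).
    level = [(x + y) / 2 for x, y in zip(a, b)]
    diff = [x - y for x, y in zip(a, b)]
    # Subdivide the population into equal-count portions ordered by level.
    order = sorted(range(len(level)), key=level.__getitem__)
    size = -(-len(order) // n_portions)  # ceiling division
    centers, modes = [], []
    for s in range(0, len(order), size):
        idx = order[s:s + size]
        centers.append(mean([level[i] for i in idx]))
        # Sort within the portion, then locate its densest value.
        modes.append(densest_value(sorted(diff[i] for i in idx)))
    # Fit the identified values to the curve and adjust every data point.
    slope, intercept = fit_line(centers, modes)
    return [d - (slope * l + intercept) for d, l in zip(diff, level)]
```

For example, `remove_joint_bias([4.0, 5.0, 6.0, 7.0], [1.0, 2.0, 3.0, 4.0], n_portions=2)` treats the constant offset between the two sources as bias and returns adjusted differences of zero. A different curve family or density estimator could be substituted without changing the overall structure.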