Abstract |
Apparatus and method for summarizing a large original data set with a representative data set. The data elements in both the original data set and the representative data set have the same variables, but the representative data set contains significantly fewer data elements. Each data element in the representative data set has an associated weight, representing the degree of compression. The representative data set is constructed in three steps. First, the original data elements are partitioned into separate bins. Second, moments of the data elements partitioned into each bin are calculated. Finally, the representative data set is generated by finding data elements and associated weights having substantially the same moments as the original data set.
|
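The three steps in the abstract can be illustrated with a minimal sketch. It is not the claimed method: it assumes a single numeric variable, equal-width bins, and moments only up to order two, none of which the abstract specifies. The names `squash` and `weighted_moments` are hypothetical and chosen for illustration.

```python
import math


def squash(values, num_bins=4):
    """Summarize 1-D data as (point, weight) pairs (illustrative sketch only)."""
    if not values:
        return []
    lo, hi = min(values), max(values)
    # Assumed binning rule: equal-width bins; fall back to width 1.0
    # when every value is identical.
    width = (hi - lo) / num_bins or 1.0

    # Step 1: partition the original data elements into bins.
    bins = [[] for _ in range(num_bins)]
    for v in values:
        idx = min(int((v - lo) / width), num_bins - 1)
        bins[idx].append(v)

    summary = []
    for members in bins:
        if not members:
            continue
        # Step 2: compute the moments of each bin (count, mean, variance).
        n = len(members)
        mean = sum(members) / n
        var = sum((v - mean) ** 2 for v in members) / n
        sd = math.sqrt(var)

        # Step 3: pick weighted points that reproduce those moments.
        # Two points at mean +/- sd, each with weight n/2, have the same
        # count, mean, and (population) variance as the bin.
        if sd == 0.0:
            summary.append((mean, float(n)))
        else:
            summary.append((mean - sd, n / 2))
            summary.append((mean + sd, n / 2))
    return summary


def weighted_moments(pairs):
    """Return (total weight, weighted sum, weighted sum of squares)."""
    total = sum(w for _, w in pairs)
    first = sum(x * w for x, w in pairs)
    second = sum(x * x * w for x, w in pairs)
    return total, first, second


data = [1.0, 2.0, 2.5, 3.0, 7.0, 8.0, 8.5, 9.0, 15.0, 16.0]
representative = squash(data, num_bins=3)
```

In this sketch each bin contributes at most two weighted points, so the representative set has at most `2 * num_bins` elements regardless of how large the original set is. Calling `weighted_moments(representative)` and comparing against the unweighted sums over `data` shows that the total weight, sum, and sum of squares agree up to floating-point rounding; higher-order moments are not matched by this simplified construction.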