Abstract |
Efficient data modeling utilizing a sparse representation of a data set. In one embodiment, a computer-implemented method first inputs a data set having a plurality of records. Each record has at least one attribute, and each attribute has a default value. The method stores a sparse representation of each record, such that the value of an attribute of the record is stored only if it differs from the default value. A data model is then generated utilizing the sparse representation, and the model is output. In one embodiment, the data model is generated in accordance with the Expectation Maximization (EM) algorithm.
|
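The sparse storage rule described in the abstract can be illustrated with a minimal Python sketch. This is an illustration under stated assumptions, not the claimed implementation: the function names (`to_sparse`, `to_dense`, `value_counts`), the dictionary-based record layout, and the example attributes and defaults are all hypothetical. It shows that a value is kept only when it differs from the attribute's default, and that a simple per-attribute statistic of the kind a model-fitting step might need can be computed without expanding the records.

```python
from collections import Counter


def to_sparse(record, defaults):
    """Keep only the attributes whose value differs from the default."""
    return {name: value for name, value in record.items()
            if value != defaults[name]}


def to_dense(sparse_record, defaults):
    """Recover the full record by filling omitted attributes with defaults."""
    return {**defaults, **sparse_record}


def value_counts(sparse_records, defaults, attribute):
    """Count the values of one attribute without expanding the records."""
    counts = Counter(r[attribute] for r in sparse_records if attribute in r)
    # Every record that omits the attribute implicitly holds the default.
    n_default = len(sparse_records) - sum(counts.values())
    if n_default:
        counts[defaults[attribute]] += n_default
    return counts


# Hypothetical example data: attribute names and defaults are illustrative.
defaults = {"bought_milk": 0, "bought_bread": 0, "bought_eggs": 0}
records = [
    {"bought_milk": 1, "bought_bread": 0, "bought_eggs": 0},
    {"bought_milk": 0, "bought_bread": 0, "bought_eggs": 0},
    {"bought_milk": 1, "bought_bread": 1, "bought_eggs": 0},
]

sparse_records = [to_sparse(r, defaults) for r in records]
print(sparse_records)
print(value_counts(sparse_records, defaults, "bought_milk"))
```

In this sketch, a record that matches the defaults everywhere is stored as an empty dictionary, so storage and per-record work scale with the number of non-default values instead of the total number of attributes.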