Abstract |
Described is a technology by which high-dimensional source data, corresponding to rows of records with identifiers and columns comprising dimensions of data values, are processed into a file model for efficient access. An inverted index corresponding to any dimension is built by mapping data from raw dimension values to mapped values based on mapping entries in a dimension table. The record identifiers are arranged into subgroups based on their mapped values; a count and/or an offset may be maintained for locating each of the subgroups. The raw values for a dimension are maintained within a raw value file. For sparse data, the raw value file may be compressed, e.g., by excluding nulls and associating a record identifier with each non-null value. A data manager provides access to data in the data files, such as by offering various access functions, and uses caching for efficiency.
|
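The indexing scheme summarized in the abstract can be illustrated with a minimal in-memory sketch. It is not the claimed implementation: the function names (`build_inverted_index`, `lookup`, `compress_sparse`), the use of list positions as record identifiers, the sorted ordering of subgroups, and the choice to leave null values out of the index are all assumptions made for illustration. The abstract describes files; plain Python lists and dictionaries stand in for them here.

```python
from collections import defaultdict


def build_inverted_index(raw_values, dimension_table):
    """Build an inverted index for one dimension.

    raw_values: list of raw dimension values; the list position is used as
        the record identifier (an assumption of this sketch).
    dimension_table: dict mapping each raw value to its mapped value.
    Returns (record_ids, directory), where record_ids holds the record
    identifiers grouped by mapped value and directory maps each mapped
    value to the (offset, count) of its subgroup within record_ids.
    """
    groups = defaultdict(list)
    for record_id, raw in enumerate(raw_values):
        if raw is None:
            continue  # assumption: nulls are not indexed
        groups[dimension_table[raw]].append(record_id)

    record_ids = []
    directory = {}
    for mapped in sorted(groups):
        ids = groups[mapped]
        directory[mapped] = (len(record_ids), len(ids))  # offset, count
        record_ids.extend(ids)
    return record_ids, directory


def lookup(record_ids, directory, mapped):
    """Return the record identifiers whose mapped value equals `mapped`."""
    offset, count = directory.get(mapped, (0, 0))
    return record_ids[offset:offset + count]


def compress_sparse(raw_values):
    """Compress a sparse column: drop nulls, keep (record_id, value) pairs."""
    return [(rid, v) for rid, v in enumerate(raw_values) if v is not None]


# Hypothetical example data for a single "color" dimension.
raw_values = ["red", None, "blue", "red", None, "green"]
dimension_table = {"red": 0, "blue": 1, "green": 2}

record_ids, directory = build_inverted_index(raw_values, dimension_table)
sparse = compress_sparse(raw_values)
```

With this data, `lookup(record_ids, directory, 0)` returns the identifiers of the two records whose raw value is `"red"`, located with a single slice using the stored offset and count rather than a scan of the column.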