Title of invention |
Distributed storage of aggregated data |
Abstract |
Techniques are described for managing aggregation of data in a distributed manner, such as for a particular client based on specified configuration information. The described techniques may include storing aggregated data values for an OLAP cube or other data structure in a distributed manner, such as in some situations in a distributed hash table. The aggregated data values to be stored may be generated in various manners, such as by performing multi-stage data manipulation operations. For example, a map-reduce architecture may be used, with a first stage involving the use of one or more specified map functions to be performed, and with at least a second stage involving the use of one or more specified reduce functions to be performed. |
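The two-stage generation described in the abstract can be illustrated with a minimal single-process sketch. It is not the patented implementation: the function names (`map_record`, `reduce_values`, `aggregate`), the choice of summation as the reduce function, and the sample field names are all assumptions made for illustration. The map stage emits one key-value pair per combination of dimension category values, and the reduce stage combines the values that share a key.

```python
from collections import defaultdict
from itertools import combinations


def map_record(record, dimensions, measure):
    """Map stage (hypothetical): emit (key, value) pairs for one input record.

    One pair is emitted for every subset of the dimensions, so the result
    contains both fully specified cells and roll-ups over omitted dimensions.
    """
    value = record[measure]
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            key = tuple((dim, record[dim]) for dim in dims)
            yield key, value


def reduce_values(values):
    """Reduce stage (hypothetical): combine all values that share a key.

    Summation is an assumption; a client could specify count, max, etc.
    """
    return sum(values)


def aggregate(records, dimensions, measure):
    """Run the map stage, group by key (the shuffle), then run the reduce stage."""
    grouped = defaultdict(list)
    for record in records:
        for key, value in map_record(record, dimensions, measure):
            grouped[key].append(value)
    return {key: reduce_values(values) for key, values in grouped.items()}


if __name__ == "__main__":
    sample = [
        {"region": "US", "product": "A", "sales": 10},
        {"region": "US", "product": "B", "sales": 5},
        {"region": "EU", "product": "A", "sales": 7},
    ]
    for key, total in aggregate(sample, ["region", "product"], "sales").items():
        print(key, total)
```

In a real map-reduce deployment the grouping step would be performed by the framework across many nodes; here it is a dictionary so the sketch stays self-contained.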
Publication number |
US8938416(B1) |
Publication date |
2015.01.20 |
Application number |
US201213350653 |
Filing date |
2012.01.13 |
Applicant |
Amazon Technologies, Inc. |
Inventors |
Cole Richard J.;Mock Alan D. |
Classification numbers |
G06F17/00;G06F17/30 |
Main classification number |
G06F17/00 |
Patent agency |
Seed IP Law Group PLLC |
Patent agent |
Seed IP Law Group PLLC |
Principal claim |
1. A computer-implemented method comprising:
generating, by one or more configured computing nodes of a data aggregation service, an OLAP (“online analytical processing”) cube having a plurality of aggregated data values, wherein each of the aggregated data values is associated with a combination of multiple dimension category values for multiple dimensions of the OLAP cube, and wherein the generating of the OLAP cube is performed using configuration information specified by a client of the data aggregation service and includes performing multiple stages having distinct indicated activities; and
after the generating of the OLAP cube, initiating, by the one or more configured computing nodes, storing the plurality of aggregated data values for the OLAP cube in a distributed hash table having a plurality of storage locations on multiple storage nodes that are each available to store information for a key-value pair by, for each of the plurality of aggregated data values:
  determining a hash key value associated with the aggregated data value based at least in part on the combination of multiple dimension category values associated with the aggregated data value;
  determining a storage location within the distributed hash table by using the determined hash key value as input to a hash function, the determined storage location being one of the plurality of storage locations and being on one of the multiple storage nodes; and
  providing the aggregated data value for storage in the determined storage location on the one storage node as part of the OLAP cube. |
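The three per-value steps of the claim (determine a hash key from the dimension category values, map that key to a storage location with a hash function, store the value there) can be sketched as follows. This is an illustrative toy, not the claimed system: the class `ToyDHT`, the helper `hash_key_for`, the use of SHA-256, and the consistent-hashing ring used to assign locations to nodes are all assumptions, and the "storage nodes" are in-memory dictionaries.

```python
import hashlib
from bisect import bisect_right


def hash_key_for(dimension_values):
    """Step 1 (hypothetical): derive a hash key from the combination of
    dimension category values. Sorting makes the key order-independent."""
    return "|".join(f"{dim}={val}" for dim, val in sorted(dimension_values.items()))


def _hash_int(text):
    """Hash function used to place keys and storage locations on the ring."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class ToyDHT:
    """In-memory stand-in for a distributed hash table (an assumption).

    Each node owns several storage locations, placed on a hash ring.
    """

    def __init__(self, node_names, locations_per_node=4):
        self.nodes = {name: {} for name in node_names}
        self.ring = sorted(
            (_hash_int(f"{name}#{i}"), name)
            for name in node_names
            for i in range(locations_per_node)
        )
        self._points = [point for point, _ in self.ring]

    def locate(self, key):
        """Step 2: use the hash key as input to the hash function and pick
        the first storage location clockwise from the resulting position."""
        index = bisect_right(self._points, _hash_int(key)) % len(self.ring)
        return self.ring[index]  # (location position, owning node name)

    def put(self, key, value):
        """Step 3: provide the value for storage at the determined location."""
        _, node = self.locate(key)
        self.nodes[node][key] = value
        return node

    def get(self, key):
        _, node = self.locate(key)
        return self.nodes[node].get(key)


def store_cube(dht, cells):
    """Store every (dimension values, aggregated value) cell of a cube."""
    return {hash_key_for(dims): dht.put(hash_key_for(dims), value)
            for dims, value in cells}


if __name__ == "__main__":
    dht = ToyDHT(["node-a", "node-b", "node-c"])
    cells = [({"region": "US", "product": "A"}, 10),
             ({"region": "EU", "product": "A"}, 7)]
    print(store_cube(dht, cells))
    print(dht.get(hash_key_for({"product": "A", "region": "US"})))
```

Because the key depends only on the dimension category values, any reader that knows the combination it wants can recompute the key and go directly to the node holding that aggregated value, without scanning the cube.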
Address |
Reno NV US |