Title of invention: EFFICIENT COLUMN BASED DATA ENCODING FOR LARGE-SCALE DATA STORAGE
Abstract: The subject disclosure relates to column-based data encoding, in which raw data to be compressed is organized by columns. As first and second layers of data-size reduction, dictionary encoding and/or value encoding are then applied to the column-organized data to create integer sequences that correspond to the columns. Next, a hybrid greedy run-length encoding and bit-packing compression algorithm further compacts the data according to an analysis of bit savings. The synergy of these hybrid data reduction techniques with the column-based organization, coupled with gains in scanning and querying efficiency owing to the compact representation of the data, results in substantially improved data compression at a fraction of the cost of conventional systems.
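The pipeline named in the abstract can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions, not the procedure claimed in the patent: the function names, the 32-bit run-count field, and the per-run cost comparison are all hypothetical choices made for illustration, and value encoding is omitted. The sketch dictionary-encodes one column into integer IDs, then keeps a run as run-length encoded only when that costs fewer bits than bit-packing the same values.

```python
from itertools import groupby


def dictionary_encode(column):
    """Map each distinct value to a small integer ID (first-seen order)."""
    dictionary = {}
    ids = []
    for value in column:
        ids.append(dictionary.setdefault(value, len(dictionary)))
    return dictionary, ids


def bits_needed(max_value):
    """Bits required to bit-pack non-negative integers up to max_value."""
    return max(1, max_value.bit_length())


def hybrid_encode(ids, count_bits=32):
    """Split an ID sequence into RLE runs and bit-packed segments.

    Assumed cost model (not taken from the source): an RLE run costs
    width + count_bits bits; a bit-packed value costs width bits.
    A run is stored as RLE only when that is the cheaper option.
    """
    width = bits_needed(max(ids)) if ids else 1
    segments = []
    for value, group in groupby(ids):
        length = len(list(group))
        rle_cost = width + count_bits
        packed_cost = width * length
        if rle_cost < packed_cost:
            segments.append(("rle", value, length))
        elif segments and segments[-1][0] == "packed":
            # Merge short runs into the preceding bit-packed segment.
            segments[-1][1].extend([value] * length)
        else:
            segments.append(("packed", [value] * length))
    return width, segments


def hybrid_decode(segments):
    """Rebuild the original ID sequence from the segments."""
    out = []
    for seg in segments:
        if seg[0] == "rle":
            out.extend([seg[1]] * seg[2])
        else:
            out.extend(seg[1])
    return out


column = ["US"] * 40 + ["CA", "MX", "CA", "US"] + ["MX"] * 50
dictionary, ids = dictionary_encode(column)
width, segments = hybrid_encode(ids)
```

In this example the two long runs are stored as RLE entries and the four mixed values in the middle are bit-packed at 2 bits each; `hybrid_decode(segments)` recovers the original ID sequence. A real implementation would also have to choose how columns and rows are ordered to lengthen runs, which this sketch does not attempt.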
Publication number: WO2010014956(A3)
Publication date: 2010.06.10
Application number: WO2009US52491
Filing date: 2009.07.31
Applicant: MICROSOFT CORPORATION
Inventors: NETZ, AMIR;PETCULESCU, CRISTIAN;CRIVAT, IOAN, BOGDAN
Classification: G06F7/76;G06F7/78
Main classification: G06F7/76
Agency:
Agent:
Main claim:
Address: