Title of invention: Data compression algorithm selection and tiering
Abstract: A data storage subsystem has a plurality of data compression engines configured to compress data, each having a different compression algorithm. A data handling system is configured to: determine a present rate of access to data; select at least one sample of the data; determine the greatest degree of compression of the data compression engines; determine the compression ratios of the operated data compression engines with respect to the selected sample(s); compress the selected at least one sample with a plurality of the data compression engines at the selected tier; operate the selected data compression engines with respect to the selected sample and determine the greatest degree of compression of the data compression engines; compress the data from which the sample was selected with the one of the operated data compression engines determined to have the greatest degree of compression; and store the compressed data in data storage repositories.
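The sample-based selection described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the standard-library codecs `zlib`, `bz2`, and `lzma` stand in for the "data compression engines", and the names `ENGINES`, `pick_sample`, `compression_ratios`, and `compress_with_best`, as well as the 4096-byte default sample size, are assumptions made only for illustration.

```python
import bz2
import lzma
import random
import zlib

# Hypothetical engine table: stdlib codecs stand in for the compression engines.
ENGINES = {
    "zlib": zlib.compress,
    "bz2": bz2.compress,
    "lzma": lzma.compress,
}


def pick_sample(data: bytes, size: int, rng: random.Random) -> bytes:
    """Return a randomly positioned slice of `data`, at most `size` bytes long."""
    if len(data) <= size:
        return data
    start = rng.randrange(len(data) - size + 1)
    return data[start:start + size]


def compression_ratios(sample: bytes, engines=ENGINES) -> dict:
    """Map engine name -> (original size / compressed size) for the sample."""
    return {name: len(sample) / len(fn(sample)) for name, fn in engines.items()}


def compress_with_best(data: bytes, sample_size: int = 4096, seed=None):
    """Compress all of `data` with the engine that did best on a random sample."""
    rng = random.Random(seed)
    sample = pick_sample(data, sample_size, rng)
    ratios = compression_ratios(sample)
    best = max(ratios, key=ratios.get)  # engine with the greatest ratio
    return best, ENGINES[best](data)
```

Only the sample is compressed by every engine; the full data is compressed once, by the winner. The returned engine name would let a caller route the result to the repository associated with that engine.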
Publication number: US9569474(B2) Publication date: 2017.02.14
Application number: US201414220790 Filing date: 2014.03.20
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION Inventors: Groseclose Michael R.;Juarez Larry;Montgomery David;Peipelman Jason L.;Rhoades Joshua M.
Classification: G06F17/30;G06F3/06;H03M7/30 Main classification: G06F17/30
Agency: Griffiths & Seaton PLLC Agent: Griffiths & Seaton PLLC
Main claim: 1. A method for selectively compressing data for a data storage system having a plurality of data compression engines, each having a different compression algorithm, comprising the steps of: determining a present rate of access to data; selecting at least one sample of said data; determining a greatest degree of compression of a plurality of data compression engines with respect to said selected at least one sample; compressing said selected at least one sample with a plurality of said data compression engines at a selected tier; operating said selected data compression engines with respect to said selected at least one sample and determining the greatest degree of compression of said data compression engines from said operation of said data compression engines with respect to said selected at least one sample; compressing said data from which said at least one sample was selected with the one of said operated data compression engines determined to have said greatest degree of compression with respect to said selected at least one sample; storing said compressed data in data storage repositories associated with the data compression engine employed to compress said data; when said rate of access indicates said data is to be compressed, selecting a tier of data compression engines with respect to said data that is inverse to said present rate of access; randomly selecting at least one sample of said data to be compressed and stored; determining compression ratios of said data engines from said operation of said data compression engines with respect to said selected at least one sample; arranging said plurality of data compression engines in a plurality of tiers from low to high in accordance with expected latency to compress data and to uncompress compressed data; and moving data between a parent and a child category, wherein at least two of said repositories are classified into parent and child categories, each at a different said tier, said parent having a lesser degree of compression than said child, and said computer program product computer readable program code, when executed on a computer processing system, causes said computer processing system to additionally move data between said parent and said child category repositories in accordance with the inverse of said present rate of access.
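The tiering steps of the claim (tiers ordered low to high by expected latency, a tier chosen inversely to the present rate of access, and movement between parent and child repositories) can be sketched as follows. This is an illustrative reading only: the tier names, the threshold values in accesses per day, and the one-step-at-a-time movement in `rebalance` are assumptions and are not specified in the claim.

```python
from bisect import bisect_right

# Hypothetical tiers, ordered low to high by expected latency to
# compress and uncompress (higher tier = more compression, more latency).
TIERS = ["tier0_fast", "tier1_medium", "tier2_slow"]

# Hypothetical access-rate thresholds (accesses per day), ascending.
THRESHOLDS = [10.0, 100.0]


def select_tier(access_rate: float) -> int:
    """Return a tier index inverse to the access rate: hot data gets a low tier."""
    # bisect_right counts the thresholds that are <= access_rate.
    hotness = bisect_right(THRESHOLDS, access_rate)
    return (len(TIERS) - 1) - hotness


def rebalance(current_tier: int, access_rate: float) -> int:
    """Move data one step toward the tier implied by its present access rate.

    Moving up a tier corresponds to parent -> child (more compression);
    moving down corresponds to child -> parent (less compression).
    """
    target = select_tier(access_rate)
    if target > current_tier:
        return current_tier + 1  # access has cooled: move to the child repository
    if target < current_tier:
        return current_tier - 1  # access has warmed: move to the parent repository
    return current_tier
```

For example, under these assumed thresholds, data accessed 5 times per day maps to the highest tier, while data accessed 500 times per day maps to the lowest.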
Address: Armonk NY US