Title Separation of data chunks into multiple streams for compression
Abstract For on-line separation of data chunks for compression, unrelated data chunks are classified, based on various attributes, into related classified data chunks. The classified data chunks are sent to at least one of a plurality of available compression contexts and are encoded by at least one of a plurality of compression operations. The compression ratio achieved is included as feedback.
Publication number US8782019(B2) Publication date 2014.07.15
Application number US201213535023 Filing date 2012.06.27
Applicant International Business Machines Corporation Inventors Amit Jonathan;Shalev Ori
Classification G06F7/00;G06F17/00 Main classification G06F7/00
Agency Griffiths & Seaton PLLC Agent Griffiths & Seaton PLLC
Principal claim 1. A method for on-line separation of data chunks for compression by a processor device in a computing storage environment, the method comprising: classifying unrelated data chunks based on one of a plurality of attributes into related classified data chunks, wherein at least one of the plurality of attributes includes at least one of a plurality of data attributes and a plurality of meta-data attributes, and the at least one of the plurality of data attributes includes at least one of known data formats, a character distribution, and similarity between previously classified data chunks provided from the feedback and the unrelated data chunks currently being analyzed for the classifying; sending the classified data chunks into at least one of a plurality of available compression contexts; encoding the classified data chunks by at least one of a plurality of compression operations, wherein a compression ratio is achieved and included as feedback; based on the feedback, resending one or more of the classified data chunks to an alternative one of the at least one of the plurality of available compression contexts for handling the classified data chunks; sending a notification to the at least one of the plurality of available compression contexts to ignore the classified data chunks and that the alternative one of the at least one of the plurality of available compression contexts is handling the one or more classified data chunks; and in conjunction with the classifying, selecting an existing one of the at least one of the plurality of available compression contexts and creating a new one of the at least one of the plurality of available compression contexts.
Address Armonk NY US
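
The steps recited in the principal claim (classify, route to a compression context, encode, feed the compression ratio back, and resend to an alternative context) can be illustrated with a minimal Python sketch. This is not the patented implementation. It assumes that a "compression context" can be modeled as a streaming `zlib` compressor, that classification uses only a known-format check and a crude character distribution, and that the "ignore" notification can be modeled by rolling the original context back to a snapshot. The names `classify`, `Separator`, `min_ratio`, and `fallback`, and the threshold values, are hypothetical.

```python
import zlib


def classify(chunk: bytes) -> str:
    """Hypothetical classifier: known data format first, then character distribution."""
    if chunk.startswith(b"\x89PNG"):
        return "png"
    if not chunk:
        return "binary"
    # Fraction of printable ASCII bytes (plus tab, LF, CR) as a crude distribution test.
    printable = sum(1 for b in chunk if 32 <= b < 127 or b in (9, 10, 13))
    return "text" if printable / len(chunk) > 0.9 else "binary"


class Separator:
    """Routes chunks to per-class zlib streams and reroutes on poor compression ratio."""

    def __init__(self, min_ratio: float = 1.1, fallback: str = "binary"):
        self.contexts = {}          # class label -> zlib streaming compressor
        self.min_ratio = min_ratio  # assumed threshold below which a chunk is resent
        self.fallback = fallback    # assumed label of the alternative context

    def _context(self, label: str):
        # Select an existing context for this class, or create a new one.
        if label not in self.contexts:
            self.contexts[label] = zlib.compressobj()
        return self.contexts[label]

    def _encode(self, label: str, chunk: bytes):
        ctx = self._context(label)
        snapshot = ctx.copy()  # state before this chunk, kept for a possible rollback
        out = ctx.compress(chunk) + ctx.flush(zlib.Z_SYNC_FLUSH)
        ratio = len(chunk) / max(len(out), 1)  # feedback signal
        return out, ratio, snapshot

    def submit(self, chunk: bytes):
        label = classify(chunk)
        out, ratio, snapshot = self._encode(label, chunk)
        if ratio < self.min_ratio and label != self.fallback:
            # Feedback says this context handled the chunk poorly: make it "ignore"
            # the chunk by restoring the snapshot, then resend to the alternative.
            self.contexts[label] = snapshot
            label = self.fallback
            out, ratio, _ = self._encode(label, chunk)
        return label, out, ratio
```

Calling `Separator().submit(chunk)` returns the label of the context that finally handled the chunk, the encoded bytes, and the observed ratio. Keeping one stream per class lets related chunks share a dictionary, which is the motivation for separating them in the first place.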