Title of invention COMPRESSION OF SMALL STRINGS
Abstract A method for compressing a set of small strings may include calculating n-gram frequencies for a plurality of n-grams over the set of small strings, selecting a subset of n-grams from the plurality of n-grams based on the calculated n-gram frequencies, defining a mapping table that maps each n-gram of the subset of n-grams to a unique code, and compressing the set of small strings by replacing n-grams within each small string in the set of small strings with corresponding unique codes from the mapping table. The method may use linear optimization to select a subset of n-grams that achieves a maximum space saving amount over the set of small strings for inclusion in the mapping table. The unique codes may be variable-length one or two byte codes. The set of small strings may be domain names.
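The steps named in the abstract (count n-gram frequencies, select a subset, define a mapping table, substitute codes) can be illustrated with a minimal Python sketch. It is not the claimed method: it assumes ASCII-only input such as domain names, uses only one-byte codes in the range 0x80 to 0xFF rather than variable-length one or two byte codes, and swaps the linear optimization for a simple greedy ranking by estimated saving. All function names and parameters (`ngram_counts`, `build_table`, `compress`, `decompress`, `max_codes`) are hypothetical.

```python
from collections import Counter


def ngram_counts(strings, n_min=2, n_max=4):
    """Count every n-gram of length n_min..n_max over the whole set."""
    counts = Counter()
    for s in strings:
        for n in range(n_min, n_max + 1):
            for i in range(len(s) - n + 1):
                counts[s[i:i + n]] += 1
    return counts


def build_table(strings, max_codes=128):
    """Map the highest-ranked n-grams to one-byte codes 0x80..0xFF.

    Assumption: greedy ranking by (length - 1) * frequency, which is the
    saving if every occurrence were replaced by a one-byte code. This is
    a stand-in for the linear optimization mentioned in the abstract.
    """
    if max_codes > 128:
        raise ValueError("only 128 one-byte codes are available above ASCII")
    counts = ngram_counts(strings)
    ranked = sorted(counts, key=lambda g: (-(len(g) - 1) * counts[g], g))
    return {g: bytes([0x80 + i]) for i, g in enumerate(ranked[:max_codes])}


def compress(s, table):
    """Replace n-grams with codes, trying the longest match first."""
    max_len = max((len(g) for g in table), default=0)
    out = bytearray()
    i = 0
    while i < len(s):
        for n in range(min(max_len, len(s) - i), 1, -1):
            code = table.get(s[i:i + n])
            if code is not None:
                out += code
                i += n
                break
        else:
            # No table entry starts here: emit the literal ASCII byte.
            out += s[i].encode("ascii")
            i += 1
    return bytes(out)


def decompress(data, table):
    """Invert compress(): bytes >= 0x80 are codes, the rest are literals."""
    reverse = {code[0]: g for g, code in table.items()}
    return "".join(reverse[b] if b >= 0x80 else chr(b) for b in data)


domains = ["example.com", "example.org", "sample.com", "test.example.net"]
table = build_table(domains)
packed = [compress(d, table) for d in domains]
restored = [decompress(p, table) for p in packed]
```

Because literal ASCII bytes stay below 0x80 and codes sit at 0x80 or above, the decoder can tell codes from literals without any escape marker. That is a design choice of this sketch only; the record does not say how the codes are laid out.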
Publication number US2013173676(A1) Publication date 2013.07.04
Application number US201113339562 Filing date 2011.12.29
Applicant THOMAS MATTHEW;PERROUD BENOIT Inventor THOMAS MATTHEW;PERROUD BENOIT
Classification G06F17/10 Main classification G06F17/10
Agency Agent
Main claim
Address