Abstract |
A method for processing data includes reading respective initial substrings of the strings in a group, and computing respective codewords for the initial substrings. The codewords indicate differences between the substrings and point to the strings from which the substrings were respectively read. The codewords are arranged in a heap, which includes a tree of nodes. Each node has no more than two children and has a respective codeword pointing to a string that is in a predetermined ordinal relation, based on the lexicographical ordering, to the strings pointed to by the codewords of the children of the node. A list of one or more of the strings is output in accordance with a lexicographical ordering by selecting one or more of the nodes in the heap and reading the strings that are pointed to by the corresponding codewords.
|
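The heap-of-codewords arrangement described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed method itself, and it makes several assumptions. The `Codeword` class, the `PREFIX_LEN` constant, and the `smallest_strings` function are hypothetical names introduced here. The codeword is simplified to an initial substring plus an index pointing back to its string, and it does not encode differences between substrings as the abstract describes. The standard `heapq` module stands in for the heap: it maintains a binary min-heap, so each node has at most two children and sorts no later than its children.

```python
import heapq

PREFIX_LEN = 4  # hypothetical length of the initial substring that is read


class Codeword:
    """Simplified codeword: an initial substring plus a pointer (index) to its string."""

    def __init__(self, strings, index):
        self.strings = strings
        self.index = index
        self.prefix = strings[index][:PREFIX_LEN]

    def __lt__(self, other):
        # Compare the short prefixes first; read the full strings only on a tie.
        if self.prefix != other.prefix:
            return self.prefix < other.prefix
        return self.strings[self.index] < other.strings[other.index]


def smallest_strings(strings, count):
    """Return up to `count` strings in lexicographical order using a heap of codewords."""
    heap = [Codeword(strings, i) for i in range(len(strings))]
    heapq.heapify(heap)  # binary min-heap: each parent sorts no later than its children
    result = []
    for _ in range(min(count, len(heap))):
        codeword = heapq.heappop(heap)  # select the root node
        result.append(strings[codeword.index])  # follow the pointer to the string
    return result


if __name__ == "__main__":
    group = ["pear", "apple", "apricot", "app", "peach"]
    print(smallest_strings(group, 3))
```

Comparing prefixes before full strings is safe in this sketch because two unequal prefixes of the same maximum length always order the same way as the strings they were read from. The full strings are therefore read only when the prefixes tie.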