Abstract |
<p>The present invention concerns a text compression apparatus comprising: means for splitting a main character string into component strings; means for counting the frequency of occurrence of each component string in the main character string and ordering the component strings by their frequency of occurrence; means for allocating to each component string a token value representative of the component string and determined by the frequency of occurrence of the component string; means for storing the token values so allocated as a token table; and means for allocating to each component string in the main character string the token value for that component string from the token table, thereby generating a sequence of token values representing the main character string in a compressed format.</p> |
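
The steps named in the abstract (split, count and order, allocate tokens, store a table, substitute) can be illustrated with a minimal Python sketch. It is not the claimed apparatus. The abstract does not say how the main string is split or what form the token values take, so the splitting rule (words, whitespace runs, single punctuation marks), the use of small integers as token values with the smallest going to the most frequent string, and all function names here are assumptions made for illustration only.

```python
import re
from collections import Counter


def split_components(text):
    # Assumption: component strings are runs of word characters, runs of
    # whitespace, or single punctuation characters. The abstract does not
    # specify the splitting rule.
    return re.findall(r"\w+|\s+|[^\w\s]", text)


def build_token_table(components):
    # Count how often each component string occurs in the main string.
    counts = Counter(components)
    # Order by descending frequency; ties are broken alphabetically so the
    # table is deterministic (the tie-break rule is an assumption).
    ordered = sorted(counts, key=lambda s: (-counts[s], s))
    # Assumption: the token value is the rank, so the most frequent
    # component string receives the smallest value.
    return {component: rank for rank, component in enumerate(ordered)}


def compress(text):
    components = split_components(text)
    table = build_token_table(components)
    # Replace every component string with its token value from the table.
    tokens = [table[c] for c in components]
    return table, tokens


def decompress(table, tokens):
    # Invert the token table and concatenate the component strings.
    inverse = {value: component for component, value in table.items()}
    return "".join(inverse[t] for t in tokens)


if __name__ == "__main__":
    table, tokens = compress("the cat and the dog and the bird")
    print(table)
    print(tokens)
    print(decompress(table, tokens))
```

Allocating the smallest values to the most frequent strings only saves space if the token sequence is later stored with a variable-length encoding; that stage is outside what the abstract describes and is not shown here.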