摘要 |
Compression of extensive, rule-based grammars used to facilitate search queries is provided herein. Rule-based grammars includes a list of rules that each comprise a sequence of token classes. Each token class is a logical grouping of tokens, and each token is a string of characters. A grammar is parsed to identify rules and token classes. Unimportant token classes are identified and sets of unimportant token classes are merged to generated merged token classes. A compressed grammar is generated by substituting the merged token classes into the grammar for corresponding unimportant token classes used to generate the merged token classes.
|