发明名称 Method and apparatus for pruning side information including directed edges not possessing shortest expanded length for grammar-based compression
摘要 A computer-implemented method for generating side information for grammar-based data compression systems, such as YK compression systems, is described. An admissible grammar (G) for an input sequence (A(S0)) having a finite set of terminal symbols is obtained. A graph representation of the admissible grammar (G) is then constructed. An edge having a lowest weight (expansion frequency), or one not possessing the shortest distance and or shortest expanded sequence length, is then pruned from the graph representation to generate a pruned graph representation. A pruned grammar (G′) is then derived by removing the occurrence corresponding to the pruned edge from the grammar G and the starting variable (S0,i) of the pruned grammar (Gi) is then expanded to generate the side information.
申请公布号 US9026427(B2) 申请公布日期 2015.05.05
申请号 US200912609461 申请日期 2009.10.30
申请人 BlackBerry Limited 发明人 Nguyen Nguyen;Yang En-hui
分类号 G06F17/27;H03M7/00;G06F17/30;H03M7/30 主分类号 G06F17/27
代理机构 Borden Ladner Gervais LLP 代理人 Anishchenko Alexander;Borden Ladner Gervais LLP
主权项 1. A computer-implemented method for generating side information for grammar-based data compression systems, the method comprising: obtaining an admissible grammar (G) for an input sequence (A(S0)) having a finite set of terminal symbols, the admissible grammar having a finite set of variables (S(j)), including a starting variable (S0) representing the input sequence (A(S0)), and a production rule for each variable in the finite set of variables (S(j))); constructing a graph representation of the admissible grammar (G), the graph representation including nodes for each variable in the finite set of variables and each terminal symbol in the finite set of terminal symbols, including a root node representing the starting variable (S0) and directed edges linking the nodes, each directed edge being directed to a node and representing an instance of a variable or a terminal symbol corresponding to the node to which it is directed in the admissible grammar (G) as defined by the production rules, and each directed edge having an expansion frequency, wherein the expansion frequency of directed edges originating from the root node has a value of ‘1’, and expansion frequencies of directed edges emanating from each non-root node are determined in accordance with summing of expansion frequencies of directed edges input to the non-root node; pruning, from the graph representation, a directed edge having a lowest expansion frequency to generate a pruned graph representation; deriving a pruned grammar (Gi); expanding a starting variable (S0,i) of the pruned grammar (Gi) to generate the side information; and storing the side information in a computer memory device; wherein constructing the graph representation of the admissible grammar further comprises determining, for each non-terminal node, a shortest path from the root node to the non-terminal node, and assigning a shortest distance (SD) value to each non-terminal node in accordance with its respective shortest path; wherein, when two or more directed edges are identified as having the same lowest expansion frequency, pruning the directed edge having the lowest expansion frequency comprises applying tie-breaking rules to select the directed edge to prune; wherein the tie-breaking rules comprise at least one of selecting the one of the two or more directed edges with the greatest shortest distance (SD) value, selecting the one of the two or more directed edges leading to a node with the shortest expanded sequence length.
地址 Waterloo, Ontario CA