Title: High bandwidth parsing of data encoding languages
Abstract: A mechanism is provided for accelerating data exchange language parsing. An input data stream is loaded into a first in, first out (FIFO) memory. A tokenization bit corresponding to a next byte to be read is extracted from a FIFO. A determination is made as to whether the tokenization bit corresponding to the next byte to be read from the FIFO indicates a control character or a non-control character located in an associated FIFO memory location in the FIFO. Responsive to the tokenization bit indicating the control character, the control character that causes a state change in a state machine is processed. Responsive to the tokenization bit indicating the non-control character, a length associated with the tokenized bit is identified and a set of non-control characters that do not cause a state change in the state machine are processed based on the length associated with the tokenized bit.
Publication number: US8903715(B2) Publication date: 2014.12.02
Application number: US201213464384 Filing date: 2012.05.04
Applicant: International Business Machines Corporation Inventor: Agarwal Kanak B.
Classification: G06F17/28; G06F17/20; G06F17/27; G06F17/21; G10L21/00 Main classification: G06F17/28
Agency: (not listed) Agent: Lammes Francis; Walder, Jr. Stephen J.; Stock William J.
Main claim: 1. A method, in a data processing system, for accelerating data exchange language parsing, the method comprising: loading, by a processor, an input data stream into a first in, first out (FIFO) memory; extracting, by the processor, a tokenization bit corresponding to a next byte to be read from the FIFO; determining, by the processor, whether the tokenization bit corresponding to the next byte to be read from the FIFO indicates a control character or a non-control character located in an associated FIFO memory location in the FIFO; responsive to the tokenization bit indicating the control character located in the associated FIFO memory location in the FIFO, processing, by the processor, the control character, wherein the control character causes a state change in a state machine and wherein processing the control character increments the FIFO read pointer by one space; and responsive to the tokenization bit indicating the non-control character located in the associated FIFO memory location in the FIFO, identifying, by the processor, a length associated with the tokenized bit and processing, by the processor, a set of non-control characters based on the length associated with the tokenized bit, wherein processing the set of control characters increments the FIFO read pointer based on the length associated with the tokenized bit and wherein the set of non-control characters do not cause a state change in the state machine.
Address: Armonk NY US
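Illustrative sketch (not part of the patent record): the read loop described in the abstract and claim 1 can be approximated in software as follows. This is a minimal sketch under stated assumptions, not the patented implementation. The control-character set (JSON-like punctuation), the names `CONTROL_CHARS`, `tokenize`, and `parse`, and the use of a Python list with an integer read pointer as the FIFO are all hypothetical choices made for illustration. The state machine is reduced to recording events, and string-context handling (for example, braces inside quoted text) is deliberately left out.

```python
# Assumption: a JSON-like control-character set. The record does not name one.
CONTROL_CHARS = set('{}[]:,"')


def tokenize(data):
    """Return (bits, run_len) for every position in the input.

    bits[i] is 1 when data[i] is a control character, else 0.
    run_len[i] is the number of consecutive non-control characters
    starting at position i (0 for control characters).
    """
    n = len(data)
    bits = [1 if ch in CONTROL_CHARS else 0 for ch in data]
    run_len = [0] * n
    # Scan backwards so each position can reuse the length of its successor.
    for i in range(n - 1, -1, -1):
        if bits[i] == 0:
            run_len[i] = 1 + (run_len[i + 1] if i + 1 < n else 0)
    return bits, run_len


def parse(data):
    """Walk the FIFO contents, advancing the read pointer as in claim 1."""
    fifo = list(data)                 # stand-in for the FIFO memory
    bits, run_len = tokenize(data)
    read_ptr = 0
    events = []
    while read_ptr < len(fifo):
        if bits[read_ptr]:
            # Control character: would drive a state change; advance by one.
            events.append(("control", fifo[read_ptr]))
            read_ptr += 1
        else:
            # Non-control run: no state change; advance by the whole run.
            length = run_len[read_ptr]
            events.append(("data", "".join(fifo[read_ptr:read_ptr + length])))
            read_ptr += length
    return events


if __name__ == "__main__":
    for event in parse('{"ab":12}'):
        print(event)
```

The point of the sketch is the pointer arithmetic: a run of non-control characters is consumed in a single step rather than one byte at a time, which is where the claimed acceleration would come from.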