Title: TABLE BOUNDARY DETECTION IN DATA BLOCKS FOR COMPRESSION
Abstract: Data is converted into a minimized data representation using a suffix tree by sorting data streams according to symbolic representations for building table boundary formation patterns. The converted data is fully reversible for reconstruction while retaining minimal header information. A scanning operation is performed by searching a suffix of each of the sorted data streams for identifying a data sequence that includes a first symbol representing textual data, and a second symbol representing numerical data. The suffix tree for the converted data is then built.
Publication number: US2015379068(A1) Publication date: 2015.12.31
Application number: US201514847478 Filing date: 2015.09.08
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION Inventors: AMIT Jonathan; DEMIDOV Lilia; HALOWANI Nir
Classification: G06F17/30 Main classification: G06F17/30
Agency: Agent:
Principal claim: 1. A system for identifying table boundaries in data blocks for compression in a computing environment, the system comprising: a processor device, operable in the computing environment, wherein the processor device: converts data into a minimized data representation using a suffix tree by sorting data streams according to a plurality of symbolic representations for building table boundary formation patterns, wherein the converted data is fully reversible for reconstruction while retaining minimal header information; and performs a scanning operation according to each of the following: searches a suffix of each of the sorted data streams for identifying a data sequence that includes a first symbol representing textual data and a second symbol representing numerical data, and builds the suffix tree for the converted data.
Address: Armonk NY US
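
The scanning operation recited in the principal claim can be illustrated with a small sketch. This is not the patented implementation: the symbols `T` (textual) and `N` (numerical), the field-splitting rule, all function names, and the naive nested-dict suffix trie (used here as a simple stand-in for a true suffix tree) are assumptions introduced only for illustration.

```python
import re

# Assumption: a field is "numerical" if it is an optionally signed decimal number.
NUMERIC = re.compile(r"[+-]?\d+(?:\.\d+)?")


def symbolize(line):
    """Map each comma/whitespace-separated field to 'N' (numerical) or 'T' (textual)."""
    fields = [f for f in re.split(r"[,\s]+", line.strip()) if f]
    return "".join("N" if NUMERIC.fullmatch(f) else "T" for f in fields)


def build_suffix_trie(pattern):
    """Naive O(n^2) suffix trie as nested dicts; '$' terminates each suffix."""
    root = {}
    for i in range(len(pattern)):
        node = root
        for symbol in pattern[i:] + "$":
            node = node.setdefault(symbol, {})
    return root


def contains(trie, sequence):
    """A sequence occurs in the pattern iff it is a path from the trie root."""
    node = trie
    for symbol in sequence:
        if symbol not in node:
            return False
        node = node[symbol]
    return True


def candidate_rows(lines, sequence="TN"):
    """Sort lines by symbolic pattern, then keep those whose pattern holds
    a textual symbol followed by a numerical symbol."""
    ordered = sorted(lines, key=symbolize)  # stable sort groups equal patterns
    return [ln for ln in ordered
            if contains(build_suffix_trie(symbolize(ln)), sequence)]


if __name__ == "__main__":
    sample = [
        "name qty price",
        "apple 3 1.50",
        "pear 12 0.80",
        "Total items listed above",
    ]
    for row in candidate_rows(sample):
        print(symbolize(row), row)
```

Sorting by symbolic pattern places lines with the same shape next to each other, so runs of identical mixed patterns suggest where a table begins and ends. A production system would presumably use a linear-time suffix tree construction rather than this quadratic trie.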