Title of invention Sorting multiple records of data using ranges of key values
Abstract A method and system for sorting data of an input file containing multiple records associated with multiple tables of a database. The multiple records include key values. The key values are segmented into ranges of key values for each table. Each range of key values for each table is a segment having a segment value. Multiple key values are selected for the multiple records. A block number, which contains a unique permutation of the segment values of the segments, is generated. The segment values denote the ranges of key values encompassing the multiple key values in each record. A sort key value for each record is ascertained, based on the generated block number for each record, and added to each record. The multiple records are sorted according to the sort key values in the multiple records. The sorted multiple records are stored in an output file.
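The segmentation and block-number steps described in the abstract can be illustrated with a minimal sketch. The table names, range boundaries, and the helper names `segment_value` and `block_number` are hypothetical and are not taken from the patent; the sketch assumes numeric keys and represents a block number as a tuple of segment values, one per table.

```python
from bisect import bisect_right

# Hypothetical range boundaries per table (illustrative values only).
# For "customer": segment 0 is key < 100, segment 1 is 100..199, segment 2 is >= 200.
# For "order":    segment 0 is key < 5000, segment 1 is >= 5000.
SEGMENT_BOUNDS = {
    "customer": [100, 200],
    "order": [5000],
}
TABLE_ORDER = ["customer", "order"]  # assumed fixed order of positions in a block


def segment_value(table, key):
    """Return the segment value of the range that encompasses `key`."""
    return bisect_right(SEGMENT_BOUNDS[table], key)


def block_number(record_keys):
    """Combine the per-table segment values of one record into a block (a tuple)."""
    return tuple(segment_value(t, record_keys[t]) for t in TABLE_ORDER)


records = [
    {"customer": 150, "order": 7000},
    {"customer": 42, "order": 10},
    {"customer": 250, "order": 4999},
]
blocks = [block_number(r) for r in records]
print(blocks)
```

Records whose keys fall into the same ranges for every table receive the same block, which is what allows them to be grouped together by the later sort.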
Publication number US9213782(B2) Publication date 2015.12.15
Application number US201414188746 Filing date 2014.02.25
Applicant International Business Machines Corporation Inventors Boh Ritsuko; Kohno Noriaki
Classification G06F17/30; G06F7/08 Main classification G06F17/30
Agency Schmeiser, Olsen & Watts, LLP Agent Schmeiser, Olsen & Watts, LLP; Pivnichny John
Principal claim 1. A method for sorting data of an input file stored on a first tangible storage device, said input file comprising multiple records associated with multiple tables of a database, each record of the multiple records comprising a plurality of key values, said method comprising:
segmenting, by a processor of a computer system, the plurality of key values in the multiple records associated with each table into ranges of key values for each table, each range of key values for each table denoted as a segment having an associated segment value;
said processor generating, for each record of the multiple records, a block number denoting a unique permutation of the segment values of the segments, said segment values respectively denoting the ranges of key values encompassing multiple key values selected for each record in association with the tables of the multiple tables;
said processor ascertaining, for each record of the multiple records, a sort key value based on the generated block number for each record of the multiple records;
said processor sorting the multiple records according to the sort key values after adding the sort key value to each record of the multiple records; and
said processor storing the sorted multiple records in an output file on a second tangible storage device;
wherein the generated block numbers collectively constitute multiple block numbers, wherein the method further comprises sequencing the block numbers of the multiple block numbers in a block sequence such that the segment value differs in only one position within the unique permutation of the segment values in each pair of successive blocks in the block sequence, and wherein said ascertaining the sort key value for each record of the multiple records comprises:
converting the generated block number for each record of the multiple records to an ordinal value denoting a sequential position of the generated block number within the block sequence;
determining an intra-block key position, within the unique permutation of the segment values of the generated block for each record of the multiple records, as being said only one position at which the segment value differs from the segment value in the block immediately preceding the generated block in the block sequence;
determining an intra-block key value as being the key value of the multiple key values of each record of the multiple records at the segment associated with the intra-block key position; and
generating the sort key value for each record of the multiple records from a combination of the ordinal value and the intra-block key value.
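The sort-key steps of the claim can be sketched as follows. This is one possible reading, not the patented implementation: it assumes the block sequence is built as a reflected mixed-radix enumeration (one way to make successive blocks differ in exactly one position), that the first block in the sequence, which has no predecessor, uses position 0 as its intra-block key position, and that the sort key is the pair (ordinal, intra-block key value). The names `block_sequence`, `sort_key`, and the sample records are hypothetical.

```python
def block_sequence(radices):
    """Enumerate all blocks so that successive blocks differ in one position.

    `radices[i]` is the number of segments at position i. The direction of
    the newly appended digit alternates per prefix (reflected enumeration).
    """
    seq = [()]
    for radix in radices:
        nxt = []
        for i, prefix in enumerate(seq):
            digits = range(radix) if i % 2 == 0 else range(radix - 1, -1, -1)
            nxt.extend(prefix + (d,) for d in digits)
        seq = nxt
    return seq


def sort_key(keys, block, sequence):
    """Return (ordinal, intra-block key value) for one record."""
    ordinal = sequence.index(block)  # sequential position within the block sequence
    if ordinal == 0:
        pos = 0  # assumption: the first block has no predecessor
    else:
        prev = sequence[ordinal - 1]
        # The single position where this block differs from the preceding block.
        pos = next(i for i, (a, b) in enumerate(zip(prev, block)) if a != b)
    return (ordinal, keys[pos])


# Hypothetical records: key values per position plus a precomputed block,
# with 3 segments at position 0 and 2 segments at position 1.
sequence = block_sequence((3, 2))
records = [
    {"id": "A", "keys": (150, 7000), "block": (1, 1)},
    {"id": "B", "keys": (42, 10), "block": (0, 0)},
    {"id": "C", "keys": (250, 4999), "block": (2, 0)},
    {"id": "D", "keys": (120, 6000), "block": (1, 1)},
]
for r in records:
    r["sort_key"] = sort_key(r["keys"], r["block"], sequence)  # add key to record
sorted_records = sorted(records, key=lambda r: r["sort_key"])
print([r["id"] for r in sorted_records])
```

In this sketch, records A and D share a block and are ordered between themselves by the key at the position that changed on entering that block, which is how the ordinal and the intra-block key value combine into a single sort key.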
Address Armonk NY US