Title FAST INCREMENTAL COLUMN STORE DATA LOADING
Abstract A database is partitioned into a plurality of sorted runs, wherein each sorted run includes a set of sorted segments of data records. One of the sorted runs preferably includes more than half of the records of the database, and the other sorted runs are progressively smaller. A query is processed by searching each of the sorted runs. Writes are effected by appending a new sorted run to the database. Sorted merges are used to combine the smaller sorted runs. Deletions are effected by marking the deleted record in the sorted run. Modifications are effected by deleting the original record and writing the modified record to the database. The larger sorted runs are only re-sorted or merged when the sorted run includes a substantial number of deletions. Two merge queues are maintained to enable rapid merges of the smaller sorted runs while a merge of larger sorted runs is occurring.
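The write, delete, modify, and merge behavior described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `SortedRun`, `RunStore`, and `merge_smallest` are hypothetical, records are plain `(key, value)` tuples rather than column-stored segments, and the two merge queues are not modeled.

```python
import heapq


class SortedRun:
    """Hypothetical sorted run: records ordered by key, plus deletion marks."""

    def __init__(self, records):
        # sorted() is close to linear on input that is already ordered
        self.records = sorted(records, key=lambda r: r[0])
        self.deleted = set()  # positions of records marked as deleted

    def live(self):
        """Return the records that have not been marked as deleted."""
        return [r for i, r in enumerate(self.records) if i not in self.deleted]


class RunStore:
    """Hypothetical store made of several sorted runs."""

    def __init__(self):
        self.runs = []

    def write(self, records):
        # A write appends a new sorted run; existing runs are untouched.
        self.runs.append(SortedRun(records))

    def delete(self, key):
        # A deletion only marks the record inside the run that holds it.
        for run in self.runs:
            for i, (k, _) in enumerate(run.records):
                if k == key:
                    run.deleted.add(i)

    def update(self, key, value):
        # A modification is a delete followed by a write.
        self.delete(key)
        self.write([(key, value)])

    def lookup(self, key):
        # A query searches every sorted run.
        return [v for run in self.runs for (k, v) in run.live() if k == key]

    def merge_smallest(self, count=2):
        # Combine the `count` smallest runs with a sorted merge,
        # dropping records that were marked as deleted.
        self.runs.sort(key=lambda r: len(r.records))
        small, rest = self.runs[:count], self.runs[count:]
        merged = list(heapq.merge(*(r.live() for r in small), key=lambda r: r[0]))
        self.runs = rest + [SortedRun(merged)]
```

In this sketch the largest run is never rewritten by `merge_smallest`, which mirrors the abstract's point that only the smaller runs are merged routinely.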
Publication No. US2017046394(A1) Publication date 2017.02.16
Application No. US201615149155 Filing date 2016.05.08
Applicant MemSQL, Inc. Inventors SKIDANOV Alex; Papitto Anders J.; Prout Adam
Classification G06F17/30 Main classification G06F17/30
Agency Agent
Main claim 1. A non-transitory computer readable medium that includes a program that, when executed on a processing system, causes the processing system to: partition a database into a plurality of segments, each segment comprising a plurality of records that are column stored, and each segment being sorted with respect to a key value and characterized by a minimum key value and a maximum key value that define a range of the segment; and partition the plurality of segments into a plurality of sorted runs, wherein the range of each segment in each sorted run does not overlap the range of any other segment in the sorted run; wherein a largest sorted run of the plurality of sorted runs is substantially larger than a smallest sorted run of the plurality of sorted runs, and the plurality of sorted runs includes other sorted runs of sizes between the smallest and the largest; wherein in response to a query for a target key value, the program causes the processor to: identify each target segment that is to be searched based on the target key value and the range of each segment in each sorted run; search each of the target segments for the target key value; and provide a result to a user based on the search.
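The query step of the claim, which selects target segments from each sorted run by comparing the target key with each segment's minimum and maximum key, can be sketched as follows. This is an illustration under assumptions, not the claimed program: `Segment`, `target_segments`, and `query` are hypothetical names, segments hold only keys, and column storage is not modeled.

```python
from bisect import bisect_left


class Segment:
    """Hypothetical segment: sorted keys with a [min_key, max_key] range."""

    def __init__(self, keys):
        self.keys = sorted(keys)
        self.min_key = self.keys[0]
        self.max_key = self.keys[-1]

    def contains(self, target):
        # Binary search inside the sorted segment.
        i = bisect_left(self.keys, target)
        return i < len(self.keys) and self.keys[i] == target


def target_segments(runs, target):
    """Pick at most one candidate segment per sorted run.

    Each run is assumed to be a list of segments ordered by key with
    non-overlapping ranges, so only one segment can cover the target.
    """
    hits = []
    for run in runs:
        maxes = [s.max_key for s in run]
        i = bisect_left(maxes, target)  # first segment with max_key >= target
        if i < len(run) and run[i].min_key <= target:
            hits.append(run[i])
    return hits


def query(runs, target):
    """Search only the target segments and report whether the key exists."""
    return any(seg.contains(target) for seg in target_segments(runs, target))
```

Because ranges within a run do not overlap, the range check rules out every other segment in that run without reading its records.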
Address San Francisco CA US