Title of invention: HISTORY PRESERVING DATA PIPELINE
Abstract: A history preserving data pipeline computer system and method. In one aspect, the history preserving data pipeline system provides immutable and versioned datasets. Because datasets are immutable and versioned, the system makes it possible to determine the data in a dataset at a point in time in the past, even if that data is no longer in the current version of the dataset.
Application publication number: US2016125000(A1) Application publication date: 2016.05.05
Application number: US201514879916 Filing date: 2015.10.09
Applicant: Palantir Technologies, Inc. Inventors: Meacham Jacob; Harris Michael; Brodman Gustav; Cuthriell Lynn; Korus Hannah; Toth Brian; Hsiao Jonathan; Elliot Mark; Schimpf Brian; Garland Michael; Nguyen Evelyn
Classification: G06F17/30; G06F11/14 Main classification: G06F17/30
Agency: Agent:
Principal claim: 1. A method comprising: at one or more computing devices comprising one or more processors and one or more storage media storing one or more computer programs executed by the one or more processors to perform the method, performing operations comprising: maintaining a build catalog comprising a plurality of build catalog entries; wherein each build catalog entry, of the plurality of build catalog entries, comprises: an identifier of a version of a derived dataset corresponding to the build catalog entry, one or more dataset build dependencies of the version of the derived dataset corresponding to the build catalog entry, each of the one or more dataset build dependencies comprising an identifier of a version of a child dataset from which the version of the derived dataset corresponding to the build catalog entry is derived, and a derivation program build dependency of the version of the derived dataset corresponding to the build catalog entry, the derivation program build dependency comprising an identifier of a version of a derivation program executed to generate the version of the derived dataset corresponding to the build catalog entry; creating a new version of a particular derived dataset in context of a successful transaction; and adding a new build catalog entry to the build catalog, the new build catalog entry comprising an identifier of the new version of the particular derived dataset, the identifier of the new version of the particular derived dataset being a transaction commit identifier assigned to the successful transaction.
Address: Palo Alto, CA, US
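
Illustrative sketch (not part of the patent record): the build catalog recited in the principal claim can be pictured as an append-only map from derived-dataset versions to their build dependencies. The Python below is a minimal sketch under stated assumptions and is not the applicant's implementation. The names `BuildCatalogEntry`, `BuildCatalog`, `build`, and `entry`, the `txn-N` commit identifier format, and the version strings in the usage lines are all hypothetical. The "transaction" is modeled simply as a callable that either returns (success) or raises (failure).

```python
from dataclasses import dataclass
from itertools import count
from typing import Callable, Dict, Iterable, Tuple


@dataclass(frozen=True)  # frozen: a recorded entry is never modified
class BuildCatalogEntry:
    derived_version: str                   # identifier of the derived dataset version
    dataset_dependencies: Tuple[str, ...]  # identifiers of the dataset versions it was derived from
    program_version: str                   # identifier of the derivation program version that ran


class BuildCatalog:
    """Hypothetical in-memory build catalog keyed by derived dataset version."""

    def __init__(self) -> None:
        self._entries: Dict[str, BuildCatalogEntry] = {}
        self._commit_ids = count(1)  # assumption: commit ids come from a simple counter

    def build(
        self,
        dataset_dependencies: Iterable[str],
        program_version: str,
        derive: Callable[[], None],
    ) -> str:
        """Run `derive` as the transaction body and record an entry only if it succeeds."""
        derive()  # if this raises, no commit id is assigned and nothing is recorded
        commit_id = f"txn-{next(self._commit_ids)}"
        # The commit id of the successful transaction doubles as the new version identifier.
        self._entries[commit_id] = BuildCatalogEntry(
            derived_version=commit_id,
            dataset_dependencies=tuple(dataset_dependencies),
            program_version=program_version,
        )
        return commit_id

    def entry(self, derived_version: str) -> BuildCatalogEntry:
        """Look up the dependencies recorded for a derived dataset version."""
        return self._entries[derived_version]

    def __len__(self) -> int:
        return len(self._entries)


catalog = BuildCatalog()
version = catalog.build(["orders@v3"], "clean_orders@v7", lambda: None)
print(catalog.entry(version))
```

Because each entry names the exact input versions and program version, a past version of a derived dataset can be traced back to what produced it, which is the property the abstract attributes to immutable, versioned datasets.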