发明名称 Processing related datasets
摘要 Processing related datasets includes receiving over an input device or port records from multiple datasets, the records of a given dataset having one or more values for one or more respective fields; and processing records from each of the multiple datasets in a data processing system. The processing includes: analyzing at least one constraint specification stored in a data storage system to determine a processing order for the multiple datasets, the constraint specification specifying one or more constraints for preserving referential integrity or statistical consistency among a group of related datasets that includes the multiple datasets, applying one or more transformations to records from each of the multiple datasets in the determined processing order, where the transformations are applied to records from a first dataset of the multiple datasets before the transformations are applied to records from a second dataset of the multiple datasets, and the transformations applied to the records from the second dataset are applied based at least in part on results of applying the transformations to the records from the first dataset and at least one constraint between the first dataset and the second dataset specified by the constraint specification, and storing or outputting results of the transformations to the records from each of the multiple datasets.
申请公布号 US8775447(B2) 申请公布日期 2014.07.08
申请号 US201113166365 申请日期 2011.06.22
申请人 Ab Initio Technology LLC 发明人 Roberts Andrew F.
分类号 G06F17/30 主分类号 G06F17/30
代理机构 Fish & Richardson P.C. 代理人 Fish & Richardson P.C.
主权项 1. A method for processing related datasets, the method including: receiving over an input device or port records from multiple datasets, the records of a given dataset having one or more values for one or more respective fields; and processing records from each of the multiple datasets in a data processing system, the processing including analyzing at least one constraint specification stored in a data storage system to determine a processing order for the multiple datasets, the constraint specification specifying one or more constraints for preserving referential integrity or statistical consistency among a group of related datasets that includes the multiple datasets,applying one or more transformations to records from each of the multiple datasets in the determined processing order, where the transformations are applied to records from a first dataset of the multiple datasets before the transformations are applied to records from a second dataset of the multiple datasets, and the transformations applied to the records from the second dataset are applied based at least in part on results of applying the transformations to the records from the first dataset and at least one constraint between the first dataset and the second dataset specified by the constraint specification, andstoring or outputting results of the transformations to the records from each of the multiple datasets.
地址 Lexington MA US