Title: FILTERING DATA LINEAGE DIAGRAMS
Abstract: Managing lineage information includes processing a specification of a directed graph to associate nodes with information for processing requests for a representation of data lineage. The processing includes: identifying a first set of one or more nodes of the directed graph corresponding to normalizing data elements being stored in a data store and de-normalizing data elements being retrieved from the data store; and associating a first plurality of nodes connected to the first set of one or more nodes and a second plurality of nodes connected to the first set of one or more nodes with at least one tag identifier having a plurality of possible tag values, where the number of possible tag values is at least as large as the number of data elements being normalized, and where nodes representing different data elements in a de-normalized record are associated with different values of the tag identifier.
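The tagging idea in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The graph, the node names (`name_in`, `normalize`, `store`, and so on), the `ELEMENT_OF` mapping, and the functions `build_parents`, `tag_branches`, and `upstream` are all hypothetical. The sketch also assumes that nodes upstream and downstream of the data store that represent the same data element receive the same tag value, so a lineage request can skip branches that carry a different value.

```python
from collections import defaultdict

# Hypothetical lineage graph: each edge points from a source to its consumer.
EDGES = [
    ("name_in", "normalize"),
    ("phone_in", "normalize"),
    ("normalize", "store"),
    ("store", "denormalize"),
    ("denormalize", "name_out"),
    ("denormalize", "phone_out"),
]

# The "first set" of nodes: normalizing, storing, and de-normalizing.
HUB = {"normalize", "store", "denormalize"}

# Assumed mapping from a node to the data element it represents.
ELEMENT_OF = {
    "name_in": "name", "name_out": "name",
    "phone_in": "phone", "phone_out": "phone",
}


def build_parents(edges):
    """Index the graph by consumer so lineage can be walked upstream."""
    parents = defaultdict(set)
    for src, dst in edges:
        parents[dst].add(src)
    return parents


def tag_branches(edges, hub, element_of):
    """Give a tag value to every non-hub node directly linked to the hub."""
    tags = {}
    for src, dst in edges:
        if dst in hub and src not in hub:
            tags[src] = element_of[src]
        if src in hub and dst not in hub:
            tags[dst] = element_of[dst]
    return tags


def upstream(node, parents, tags=None):
    """Return the ancestors of node; with tags, skip mismatched branches."""
    wanted = tags.get(node) if tags else None
    seen, stack = set(), [node]
    while stack:
        current = stack.pop()
        for parent in parents[current]:
            if parent in seen:
                continue
            # Untagged nodes (the hub) always pass; tagged ones must match.
            if wanted is not None and tags.get(parent, wanted) != wanted:
                continue
            seen.add(parent)
            stack.append(parent)
    return seen
```

Under these assumptions, calling `upstream("name_out", build_parents(EDGES))` without tags reaches both `name_in` and `phone_in`, because every element funnels through the same store. Passing the tags from `tag_branches(EDGES, HUB, ELEMENT_OF)` drops `phone_in` from the result.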
Publication number: US2016232230(A1) Publication date: 2016.08.11
Application number: US201615040162 Filing date: 2016.02.10
Applicant: Ab Initio Technology LLC Inventor: Radivojevic Dusan
Classification: G06F17/30 Main classification: G06F17/30
Agency: Agent:
Main claim: 1. A method for managing lineage information in a computing system, the method including: storing, in a data storage system, a specification of a directed graph that includes a plurality of nodes representing computation, and a plurality of nodes representing data elements received or produced by a computation during execution of the computation, and directed links between nodes representing lineage relationships between a computation and a data element; processing, using at least one processor, the specification to associate nodes with information for processing requests for a representation of data lineage, the processing including: identifying a first set of one or more nodes of the directed graph corresponding to normalizing data elements being stored in a data store and de-normalizing data elements being retrieved from the data store, where normalizing data elements includes transforming a record corresponding to multiple data elements into multiple records that have a common format for at least one field, and where de-normalizing data elements includes transforming multiple records that have a common format for at least one field into a single record corresponding to multiple data elements; and associating a first plurality of nodes connected to the first set of one or more nodes by a first directed link representing a first lineage relationship and a second plurality of nodes connected to the first set of one or more nodes by a second directed link representing a second lineage relationship with at least one tag identifier having a plurality of possible tag values, where the number of possible tag values is at least as large as the number of data elements being normalized, and where nodes representing different data elements in a de-normalized record are associated with different values of the tag identifier.
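The claim's definitions of normalizing and de-normalizing can be made concrete with a short Python sketch. It is offered only as an illustration under stated assumptions, not as the claimed method: the row layout (`element` and `value` columns), the `key_field` parameter, and the function names `normalize` and `denormalize` are hypothetical. Normalizing turns one record holding several data elements into several rows that share a common format, and de-normalizing merges such rows back into a single record.

```python
def normalize(record, key_field):
    """Split one wide record into one common-format row per data element."""
    key = record[key_field]
    return [
        {key_field: key, "element": name, "value": value}
        for name, value in record.items()
        if name != key_field
    ]


def denormalize(rows, key_field):
    """Merge common-format rows that share a key into a single wide record."""
    if not rows:
        raise ValueError("no rows to de-normalize")
    keys = {row[key_field] for row in rows}
    if len(keys) != 1:
        raise ValueError("rows belong to different records")
    record = {key_field: rows[0][key_field]}
    for row in rows:
        # Each row restores exactly one data element of the wide record.
        record[row["element"]] = row["value"]
    return record


# Hypothetical example record with two data elements besides the key.
wide = {"id": 7, "name": "Ada", "phone": "555-0100"}
rows = normalize(wide, "id")
restored = denormalize(rows, "id")
```

In this sketch every stored row has the same three fields, so the store alone no longer shows which output element descends from which input element. The `element` column takes one distinct value per normalized data element, which loosely mirrors the claim's requirement that there be at least as many tag values as data elements being normalized.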
Address: Lexington, MA, US