Title: DATA LINEAGE SUMMARIZATION
Abstract: An identification of a directed graph is received. The directed graph includes data transformation nodes that represent computations that transform data elements and one or more data nodes that represent data elements, and includes directed links that represent lineage relationships. Summary information is computed based on paths in the directed graph and stored in one or more summary objects. The computing includes: receiving designation of interest for a plurality of the nodes of the directed graph; and generating one or more summary objects for remaining nodes not included in the plurality of nodes of interest, a first summary object including summary information based on a first path between a first node of interest and a second node of interest that does include one or more of the remaining nodes and does not include any nodes of interest other than the first and second nodes.
Publication number: US2016028580(A1) Publication date: 2016.01.28
Application number: US201514805616 Filing date: 2015.07.22
Applicant: Ab Initio Technology LLC Inventors: Radivojevic Dusan; Yeracaris Anthony M.; Gould Joel; Schon Andrew
Classification: H04L12/24 Main classification: H04L12/24
Agency: Agent:
Principal claim: 1. A method for managing lineage information in a computing system, the method including: receiving, over an input device or port, an identification of a directed graph that includes one or more data transformation nodes that represent computations that transform data elements and one or more data nodes that represent data elements, and includes directed links that represent respective lineage relationships between a computation and a data element to be received or produced by the computation during execution of the computation; and computing, using at least one processor, summary information based on paths in the directed graph, and storing the summary information in one or more summary objects, the computing including receiving designation of interest for a plurality of the nodes of the directed graph; and generating one or more summary objects for remaining nodes not included in the plurality of nodes of interest, a first summary object of the one or more summary objects including summary information based on a first path between a first node of interest and a second node of interest that does include one or more of the remaining nodes and does not include any nodes of interest other than the first and second nodes.
Address: Lexington MA US
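
The summarization step recited in the principal claim can be illustrated with a minimal Python sketch. This is not the applicant's implementation. It assumes the directed graph is given as a plain adjacency dictionary, that the nodes of interest are given as a set, and that the "summary information" is simply the ordered list of remaining (non-interest) nodes found on a path. The names `SummaryObject`, `summarize`, `hidden_nodes`, and the sample node names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SummaryObject:
    """Hypothetical summary of one path between two nodes of interest."""
    source: str
    target: str
    hidden_nodes: tuple  # remaining nodes on the path, in traversal order


def summarize(links, interest):
    """Build one summary per path that joins two nodes of interest.

    links: dict mapping each node to a list of its successor nodes.
    interest: set of nodes designated as nodes of interest.
    A path qualifies only if it passes through at least one remaining
    node and through no node of interest other than its two endpoints.
    """
    summaries = []

    def walk(start, node, hidden):
        for nxt in links.get(node, []):
            if nxt in interest:
                # Reached the second node of interest; a direct link with
                # no remaining nodes in between is not summarized.
                if hidden:
                    summaries.append(SummaryObject(start, nxt, tuple(hidden)))
            elif nxt not in hidden:
                # Keep walking through remaining nodes; skipping nodes already
                # on the current path guards against cycles (an assumption,
                # since lineage graphs are usually acyclic).
                walk(start, nxt, hidden + [nxt])

    for start in sorted(interest):
        walk(start, start, [])
    return summaries


# Sample lineage: a data node flows through three steps into a report.
links = {
    "raw.csv": ["clean"],
    "clean": ["staged"],
    "staged": ["aggregate"],
    "aggregate": ["report.csv"],
}
interest = {"raw.csv", "report.csv"}

result = summarize(links, interest)
for summary in result:
    print(summary)
```

In this sketch the three intermediate nodes collapse into a single summary object linking `raw.csv` to `report.csv`, so a lineage view could show only the two nodes of interest while still retaining what lies between them. A richer summary (for example, counts of transformation nodes or aggregated metadata) would replace the `hidden_nodes` tuple.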