Title: Using views of subsets of nodes of a schema to generate data transformation jobs to transform input files in first data formats to output files in second data formats
Abstract: Provided is a method for processing input data in a storage system and in communication with a repository. Views are generated that comprise a tree of nodes selected from a subset of nodes in a hierarchical representation of a schema. The views are saved to the repository. At least one of the views is used to create a job comprising a sequence of data transformation steps to transform the input data described by input schemas to the output data described by output schemas.
Publication number: US9607061(B2)  Publication date: 2017.03.28
Application number: US201514645328  Filing date: 2015.03.11
Applicant: International Business Machines Corporation  Inventors: Holmes John C.; Jiang Ming; Li Jeff J.; Li Yong; Sotkowitz David S.
Classification: G06F17/30; G06F17/22  Main classification: G06F17/30
Agency: Konrad, Raynes, Davda & Victor LLP  Agent: Victor David W.; Konrad, Raynes, Davda & Victor LLP
Main claim: 1. A computer program product for processing input data in a storage system and in communication with a repository, wherein the computer program product comprises a non-transitory computer readable storage medium having computer readable program code embodied therein that executes to perform operations, the operations comprising: generating views, each view comprising a tree of nodes selected from a subset of nodes in a hierarchical representation of a schema; saving the views to the repository; and using a selected view comprising at least one of the views to create a job comprising a sequence of data transformation steps to transform input data described by an input schema in a first data format to output data described by an output schema in a second data format, wherein the sequence of data transformation steps include operations on nodes of data from the input data that map results of the sequence of data transformation steps to the selected view, wherein the first data format has text delimited data or a database file and wherein the second data format defines a hierarchical representation of nodes that represent data and a relationship of data content.
Address: Armonk NY US
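
Illustrative sketch: the flow recited in the main claim (generate a view as a tree of nodes selected from a schema, save it to a repository, then use it to build a job that maps delimited text onto the view's hierarchy) might look roughly like the following. This is not the patented implementation. The schema contents, the dict-based repository, and the names make_view, leaf_paths, create_job, and run_job are all hypothetical and chosen only for illustration.

```python
import copy
import csv
import io

# Hypothetical hierarchical schema: nested dicts, leaves are type names.
SCHEMA = {
    "order": {
        "id": "string",
        "customer": {"name": "string", "email": "string", "phone": "string"},
        "total": "decimal",
    }
}


def make_view(schema, paths):
    """Build a view: a tree holding only the schema nodes on the given paths."""
    view = {}
    for path in paths:
        src, dst = schema, view
        parts = path.split("/")
        for i, part in enumerate(parts):
            src = src[part]  # KeyError if the path is not in the schema
            if i == len(parts) - 1:
                dst[part] = copy.deepcopy(src)
            else:
                dst = dst.setdefault(part, {})
    return view


def leaf_paths(view, prefix=()):
    """Yield every leaf of the view as a tuple of node names."""
    for name, child in view.items():
        if isinstance(child, dict):
            yield from leaf_paths(child, prefix + (name,))
        else:
            yield prefix + (name,)


def create_job(view, column_map):
    """Create a job (list of steps) mapping delimited columns to view nodes.

    column_map maps an input column name to a "/"-separated view path.
    """
    allowed = set(leaf_paths(view))
    for column, path in column_map.items():
        if tuple(path.split("/")) not in allowed:
            raise ValueError(f"{path!r} (column {column!r}) is not in the view")

    def parse(text):
        # Step 1: read comma-delimited text into flat rows.
        return list(csv.DictReader(io.StringIO(text)))

    def map_to_view(rows):
        # Step 2: place each column value at its node in the view's hierarchy.
        records = []
        for row in rows:
            record = {}
            for column, path in column_map.items():
                *parents, leaf = path.split("/")
                node = record
                for parent in parents:
                    node = node.setdefault(parent, {})
                node[leaf] = row[column]
            records.append(record)
        return records

    return [parse, map_to_view]


def run_job(job, data):
    """Run the steps in sequence, feeding each step's output to the next."""
    for step in job:
        data = step(data)
    return data


# Usage: save a view to a (hypothetical, in-memory) repository, then use it.
repository = {}
repository["order_contact"] = make_view(SCHEMA, ["order/id", "order/customer/name"])
job = create_job(
    repository["order_contact"],
    {"order_id": "order/id", "cust": "order/customer/name"},
)
result = run_job(job, "order_id,cust\n1,Ada\n")
```

Validating the column mapping against the view when the job is created, rather than against the full schema, mirrors the claim's requirement that the transformation results map to the selected view. A mapping to a schema node outside the view (for example order/total here) is rejected before any data is read.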