Title of invention: SELF-ANALYZING DATA PROCESSING JOB TO DETERMINE DATA QUALITY ISSUES
Abstract: Techniques are disclosed to determine data quality issues in data processing jobs. A data processing job is received, the data processing job specifying one or more processing steps designed based on one or more data schemas and further specifying one or more desired quality metrics to measure at the one or more processing steps. One or more state machines are provided that are generated based on the quality metrics and on the data schemas. Input data to the data processing job are processed using the one or more state machines in order to generate output data and a set of data quality records characterizing a set of data quality issues identified during execution of the data processing job.
Publication number: US2014279835(A1) Publication date: 2014.09.18
Application number: US201414224864 Filing date: 2014.03.25
Applicant: International Business Machines Corporation Inventors: LI Jeff J.; LI Yong
Classification: G06F17/30 Main classification: G06F17/30
Agency: Agent:
Main claim: 1. A computer-implemented method to determine data quality issues in data processing jobs, the method comprising: receiving a data processing job specifying one or more processing steps designed based on one or more data schemas and further specifying one or more desired quality metrics to measure at the one or more processing steps; providing one or more state machines generated based on the quality metrics and on the data schemas; processing input data for the data processing job by operation of one or more computer processors and using the one or more state machines, in order to generate output data for the data processing job and a set of data quality records characterizing a set of data quality issues identified during execution of the data processing job; and outputting the generated set of data quality records.
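The steps of the claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the states (CHECK_NULL, CHECK_TYPE, DONE), the metric names ("null_value", "type_mismatch"), the QualityRecord fields, the helpers build_machine and run_job, and the choice to keep rows with issues out of the output are all assumptions made for illustration. The record states only that state machines are generated from the quality metrics and data schemas and that processing yields output data plus data quality records.

```python
from dataclasses import dataclass
from enum import Enum, auto


class State(Enum):
    # Hypothetical states; the record does not name any.
    CHECK_NULL = auto()
    CHECK_TYPE = auto()
    DONE = auto()


@dataclass
class QualityRecord:
    row: int
    field: str
    issue: str


def build_machine(field, expected_type, metrics):
    """Generate a per-field state machine from one schema entry and the metrics."""
    def run(row_index, value):
        records = []
        state = State.CHECK_NULL
        while state is not State.DONE:
            if state is State.CHECK_NULL:
                if value is None:
                    if "null_value" in metrics:
                        records.append(QualityRecord(row_index, field, "null_value"))
                    state = State.DONE  # nothing left to check on a missing value
                else:
                    state = State.CHECK_TYPE
            elif state is State.CHECK_TYPE:
                if "type_mismatch" in metrics and not isinstance(value, expected_type):
                    records.append(QualityRecord(row_index, field, "type_mismatch"))
                state = State.DONE
        return records
    return run


def run_job(rows, schema, metrics, step):
    """Apply one processing step; return (output data, data quality records)."""
    machines = {name: build_machine(name, typ, metrics) for name, typ in schema.items()}
    output, quality = [], []
    for index, row in enumerate(rows):
        issues = []
        for name, machine in machines.items():
            issues.extend(machine(index, row.get(name)))
        quality.extend(issues)
        if not issues:  # assumption: only clean rows reach the output
            output.append(step(row))
    return output, quality


# Example job: one schema, two metrics, one step that upper-cases "name".
schema = {"id": int, "name": str}
metrics = {"null_value", "type_mismatch"}
rows = [{"id": 1, "name": "a"}, {"id": "2", "name": None}]
output, quality = run_job(rows, schema, metrics, lambda r: {**r, "name": r["name"].upper()})
```

In this sketch the first row passes both checks and is transformed, while the second row produces two quality records (a type mismatch on id and a null value on name) and is kept out of the output.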
Address: Armonk NY US