Title Validating code of an extract, transform and load (ETL) tool
Abstract An approach for validating code for an extract, transform and load (ETL) tool is provided. Naming, coding, and performance standards for the code are received. The code is exported to a job definition file and parsed. Violations of the standards are determined by a mismatch between the parsed code and the standards. A report identifying the violations is generated. After the report is reviewed and the code is reworked to comply with the standards, the reworked code is exported to another job definition file and parsed, and is then determined not to include the violations. A second report is generated indicating that the reworked code does not include the violations. An approval of the reworked code is received based on the second report. Based on attributes of a job included in the code, a violation of one of the performance standards is determined.
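The parse, check, report, rework, and re-check cycle described in the abstract can be sketched in Python. This is only an illustration under stated assumptions: the record does not specify the job definition file format, so the XML layout, the `stage` element, the `type` and `name` attributes, and the naming patterns below are all hypothetical, and only a naming standard is checked.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical naming standard: each stage type maps to a required name pattern.
NAMING_STANDARD = {
    "Transformer": re.compile(r"^xfm_[a-z0-9_]+$"),
    "Sort": re.compile(r"^srt_[a-z0-9_]+$"),
}


def parse_job_definition(xml_text):
    """Parse an (assumed) XML job definition into (stage_type, stage_name) pairs."""
    root = ET.fromstring(xml_text)
    return [(s.get("type"), s.get("name")) for s in root.iter("stage")]


def find_naming_violations(stages):
    """Return one message per stage whose name does not match its pattern."""
    violations = []
    for stage_type, name in stages:
        pattern = NAMING_STANDARD.get(stage_type)
        if pattern is not None and not pattern.match(name or ""):
            violations.append(
                f"{stage_type} stage '{name}' does not match {pattern.pattern}"
            )
    return violations


def generate_report(violations):
    """Render the violations as a plain-text report."""
    if not violations:
        return "No violations found."
    return "\n".join(["Violations:"] + [f"- {v}" for v in violations])


# First export: one stage breaks the assumed naming standard.
ORIGINAL = """<job name="load_customers">
  <stage type="Transformer" name="Transformer_1"/>
  <stage type="Sort" name="srt_by_id"/>
</job>"""

# Second export, after the developer reworks the code.
REWORKED = ORIGINAL.replace("Transformer_1", "xfm_clean_names")

first_report = generate_report(find_naming_violations(parse_job_definition(ORIGINAL)))
second_report = generate_report(find_naming_violations(parse_job_definition(REWORKED)))
```

Here `first_report` lists the offending stage and `second_report` shows the reworked export passing the same checks, which is the point at which an approval could be given.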
Publication number US9547702(B2) Publication date 2017.01.17
Application number US201514954114 Filing date 2015.11.30
Applicant International Business Machines Corporation Inventor Vilakkumadathil Rokky
Classification G06F9/45;G06F17/30;G06F11/36;G06F9/445 Main classification G06F9/45
Agency Schmeiser, Olsen & Watts Agent Schmeiser, Olsen & Watts; Pivnichny John
Principal claim 1. A method of validating code of an extract, transform and load (ETL) tool, the method comprising the steps of:
responsive to a receipt of naming, coding, and performance standards for the code of the ETL tool and an export of the code of the ETL tool to a job definition file, a computer parsing the code of the ETL tool in the job definition file;
the computer determining violations of the naming, coding, and performance standards in part by determining the parsed code of the ETL tool does not match the naming, coding, and performance standards;
the computer generating a report which identifies the violations;
based at least in part on a review of the report and a rework of the code of the ETL tool to comply with the naming, coding and performance standards and responsive to an export of the reworked code of the ETL tool to another job definition file, the computer parsing the reworked code of the ETL tool in the other job definition file, determining that the parsed reworked code of the ETL tool does not include the violations of the naming, coding and performance standards, and generating a second report that indicates that the reworked code of the ETL tool does not include the violations;
the computer receiving maximum numbers of aggregator stages of a job included in the code of the ETL tool, transformer stages of the job, occurrences of repartitioning of data sets in the job, sort stages of the job, database read/write operations of the job, and sequential file read/write operations of the job;
the computer receiving a minimum ratio of a number of stages of the job to a number of stages of the job that are annotated;
the computer receiving minimum sizes of a transaction for any insert, update or delete operation of the job and an array employed for any insert, update or delete operation of the job; and
based on aggregator stages of the job exceeding the maximum number of aggregator stages, transformer stages of the job exceeding the maximum number of transformer stages of the job, occurrences of repartitioning of data sets in the job exceeding the maximum number of occurrences of repartitioning of data sets in the job, sort stages of the job exceeding the maximum number of sort stages, database read/write operations of the job exceeding the maximum number of database read/write operations, sequential file read/write operations of the job exceeding the maximum number of sequential file read/write operations, a ratio of the number of stages of the job to the number of stages of the job that are annotated being less than the minimum ratio of the number of stages to the number of stages that are annotated, a size of a transaction for an insert, update or delete operation of the job being less than the minimum size of the transaction, or a size of an array employed for an insert, update or delete operation of the job being less than the minimum size of the array, the computer determining a violation of a performance standard included in the naming, coding, and performance standards.
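The performance-standard test in the final step of claim 1 amounts to comparing job attributes against received maxima and minima. The following Python sketch shows one possible reading of that comparison. All class, field, and function names are hypothetical, the threshold values are made up for illustration, and skipping the ratio check when no stage is annotated is an assumption, since the claim does not address that case.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Limits:
    """Received performance standards (field names are illustrative)."""
    max_aggregator_stages: int
    max_transformer_stages: int
    max_repartitions: int
    max_sort_stages: int
    max_db_ops: int
    max_seq_file_ops: int
    min_stage_ratio: float       # total stages / annotated stages
    min_transaction_size: int
    min_array_size: int


@dataclass
class Job:
    """Attributes of one job extracted from the parsed job definition file."""
    aggregator_stages: int
    transformer_stages: int
    repartitions: int
    sort_stages: int
    db_ops: int
    seq_file_ops: int
    total_stages: int
    annotated_stages: int
    transaction_sizes: List[int]  # one entry per insert/update/delete operation
    array_sizes: List[int]        # one entry per insert/update/delete operation


def performance_violations(job: Job, limits: Limits) -> List[str]:
    """Return a message for each performance condition of claim 1 that is met."""
    found = []
    maxima = [
        ("aggregator stages", job.aggregator_stages, limits.max_aggregator_stages),
        ("transformer stages", job.transformer_stages, limits.max_transformer_stages),
        ("repartitioning occurrences", job.repartitions, limits.max_repartitions),
        ("sort stages", job.sort_stages, limits.max_sort_stages),
        ("database read/write operations", job.db_ops, limits.max_db_ops),
        ("sequential file read/write operations", job.seq_file_ops, limits.max_seq_file_ops),
    ]
    for label, actual, maximum in maxima:
        if actual > maximum:
            found.append(f"{label}: {actual} exceeds maximum {maximum}")

    # Assumption: with zero annotated stages the ratio is undefined, so skip it.
    if job.annotated_stages > 0:
        ratio = job.total_stages / job.annotated_stages
        if ratio < limits.min_stage_ratio:
            found.append(f"stage ratio: {ratio:.2f} is below minimum {limits.min_stage_ratio}")

    if any(size < limits.min_transaction_size for size in job.transaction_sizes):
        found.append(f"transaction size below minimum {limits.min_transaction_size}")
    if any(size < limits.min_array_size for size in job.array_sizes):
        found.append(f"array size below minimum {limits.min_array_size}")
    return found


# Illustrative values only.
limits = Limits(2, 4, 3, 3, 6, 6, 1.0, 1000, 500)
job = Job(1, 2, 1, 5, 4, 2, 10, 5, [100, 2000], [500])
violations = performance_violations(job, limits)
```

With these sample values the job has too many sort stages and one transaction smaller than the minimum, so `violations` holds two messages. Because the claim joins the conditions with "or", any single entry is enough to count as a violation of a performance standard.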
Address Armonk NY US