Title of Invention Performance checking component for an ETL job
Abstract Generation of a performance determination report for an Extract, Transform, Load (ETL) job includes decomposing the ETL job into two or more stage instances, and identifying one or more conditions for each of the stage instances. A set of tests for each of the identified conditions is generated. A first set of test results is generated by performing the set of tests. It is determined whether a test result from the first set of test results is outside of a first range. Conditions that can be identified include a non-volatile free memory condition, a network reliability condition, a network configuration condition, an application availability condition, a database availability condition, a database performance condition, a schema validity condition, an installed libraries condition, a configuration parameter condition, a volatile free memory condition, and a third-party tool condition.
Publication No. US9626416(B2) Publication Date 2017.04.18
Application No. US201414291421 Filing Date 2014.05.30
Applicant International Business Machines Corporation Inventors Li Jeff J.;Nusbickel Wendi L.;Tsimis James G.
Classification G06F17/30;G06F11/34;G06F11/22;G06F11/30;G06F15/16 Main Classification G06F17/30
Agency Agent Montanaro Jared L.
Main Claim 1. A computer-implemented method performed by a processor for generating a performance determination report for an Extract, Transform, Load (ETL) job, comprising: decomposing an ETL job into two or more stage instances, wherein the two or more stage instances include a first extraction stage instance and a second extraction stage instance; identifying one or more conditions for each of the stage instances, wherein the one or more conditions include a network reliability condition for the first and second extraction stage instances; generating a set of tests for each of the identified conditions, wherein the set of tests for the network reliability condition includes a ping test; generating a first set of test results by performing the sets of tests; determining a first range for a test result from the first set of test results, wherein determining the first range includes calculating one or more statistical metrics from two or more historical test results, wherein calculating the one or more statistical metrics includes calculating a standard deviation and a measure of dispersion from the two or more historical test results; determining whether the test result from the first set of test results is outside of the first range, wherein the determining of whether the test result from the first set of test results is outside the first range includes comparing the test result with another test result for the same test performed at a second time, the second time being prior to a first time, wherein the generating of the first set of test results is performed at the first time, and wherein the same test performed is a temporary memory space test that includes determining the amount of volatile memory that is available at a compute node at the first and second times; and generating the performance determination report, the performance determination report including the test result.
Address Armonk NY US
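The range check recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the names Measurement, expected_range, check, and build_report are hypothetical, the stage and test labels are made up, and the choice of "mean plus or minus k sample standard deviations" as the first range is an assumption, since the claim only requires that a standard deviation and a measure of dispersion be calculated from two or more historical test results. The measurements are passed in as plain numbers; no actual ping or memory probe is performed.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class Measurement:
    """One test result for one stage instance (hypothetical structure)."""
    stage: str    # e.g. "extract_1"
    test: str     # e.g. "ping_ms" or "free_volatile_mb"
    value: float  # result obtained at the first (current) time


def expected_range(history, k=2.0):
    """Return (low, high) as mean +/- k sample standard deviations.

    Assumption: this is one plausible way to derive a range from
    historical results; the claim does not fix the formula.
    """
    if len(history) < 2:
        raise ValueError("need two or more historical test results")
    m, s = mean(history), stdev(history)
    return (m - k * s, m + k * s)


def check(result, history, k=2.0):
    """Compare a current result with the range and with the prior run."""
    low, high = expected_range(history, k)
    return {
        "stage": result.stage,
        "test": result.test,
        "value": result.value,
        "range": (low, high),
        "outside_range": not (low <= result.value <= high),
        # history[-1] stands in for the same test run at the earlier time
        "change_from_previous": result.value - history[-1],
    }


def build_report(results, histories):
    """Assemble report rows; histories is keyed by (stage, test)."""
    return [check(r, histories[(r.stage, r.test)]) for r in results]


if __name__ == "__main__":
    histories = {
        ("extract_1", "ping_ms"): [10.0, 12.0, 11.0, 9.0, 13.0],
        ("extract_1", "free_volatile_mb"): [4000.0, 4100.0, 3900.0, 4050.0],
    }
    results = [
        Measurement("extract_1", "ping_ms", 25.0),
        Measurement("extract_1", "free_volatile_mb", 4020.0),
    ]
    for row in build_report(results, histories):
        print(row)
```

With the sample data in the demo, the ping result of 25.0 lies outside its historical range, while the free volatile memory result of 4020.0 lies inside. A real checker would also need the decomposition of the job into stage instances and the mapping from conditions to tests, which this sketch leaves out.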