Title of invention Data stream quality management for analytic environments
Abstract According to one aspect of the present disclosure, a system and technique for data quality management is disclosed. The system includes a processor and an ingress quality specification (IQS) module executable by the processor in a runtime environment with a data stream analytic module. The IQS module is configured to: receive the data stream; analyze a subset of data of the data stream to determine if the subset of data meets a quality expectation of the analytic module; annotate the subset of data to indicate a quality status based on whether the subset of data meets the quality expectation of the analytic module; and output the data stream to the analytic module.
Publication number US9460131(B2) Publication date 2016.10.04
Application number US201213463850 Filing date 2012.05.04
Applicant INTERNATIONAL BUSINESS MACHINES CORPORATION Inventors George Randy; McKeown Robert J.
Classification G06F17/30 Main classification G06F17/30
Agency Agent Baudino James L.
Main claim 1. A system, comprising: an analytic module configured to analyze a data stream output by an object and output an analysis of the object, the analytic module having a data quality expectation of data of the data stream; a memory storing a plurality of ingress quality specification (IQS) modules each corresponding to a different quality characteristic and each associated with the analytic module; and an interface configured to enable a selection of at least one IQS module from the plurality of IQS modules to deploy with the analytic module to: receive the data stream from the object; analyze a subset of data of the data stream to determine if the subset of data meets the quality expectation of the analytic module; modify the subset of data by annotating the subset of data to indicate a quality status based on whether the subset of data meets the quality expectation of the analytic module; and output the data stream to the analytic module; and wherein the analytic module is configured to receive the data stream from the IQS module, identify data not meeting the quality expectation based on the annotations, and omit from its analysis of the object the data not meeting the quality expectation; and wherein the selected IQS module is configured to determine whether a selected subset of data of the data stream includes a minimum quantity of data samples based on the quality expectation of the analytic module.
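The flow recited in the claim (receive, check a subset against a minimum sample count, annotate, pass downstream, omit flagged data from the analysis) can be illustrated with a minimal Python sketch. This is not the patented implementation. The names Window, min_samples_iqs, analytic_mean and the "quality_ok" annotation key are hypothetical, and a simple mean stands in for whatever analysis the analytic module actually performs.

```python
from dataclasses import dataclass, field
from typing import Iterable, Iterator, List


@dataclass
class Window:
    """A subset of the data stream plus its quality annotations (hypothetical type)."""
    samples: List[float]
    annotations: dict = field(default_factory=dict)


def min_samples_iqs(stream: Iterable[Window], min_samples: int) -> Iterator[Window]:
    """Hypothetical IQS module: annotate each subset, then pass it downstream.

    The subset is never dropped here; it is only marked, so the analytic
    module decides what to do with data that fails the expectation.
    """
    for window in stream:
        window.annotations["quality_ok"] = len(window.samples) >= min_samples
        yield window


def analytic_mean(stream: Iterable[Window]) -> float:
    """Hypothetical analytic module: averages only subsets annotated as good."""
    good = [s for w in stream if w.annotations.get("quality_ok") for s in w.samples]
    if not good:
        raise ValueError("no subset met the quality expectation")
    return sum(good) / len(good)


# Three subsets; the middle one has too few samples for this analytic.
windows = [Window([1.0, 2.0, 3.0]), Window([100.0]), Window([4.0, 5.0, 6.0])]
result = analytic_mean(min_samples_iqs(windows, min_samples=3))
print(result)  # 3.5
```

In this sketch the undersized subset still reaches the analytic module, which reads the annotation and leaves that subset out of its result, mirroring the split of responsibilities described in the claim.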
Address Armonk NY US