Title of invention: Dynamic outlier bias reduction system and method
Abstract: A system and method are described herein for data filtering to reduce functional and trend line outlier bias. Outliers are removed from the data set through an objective statistical method. Bias is determined based on absolute error, relative error, or both. Error values are computed from the data, model coefficients, or trend line calculations. Outlier data records are removed when the error values are greater than or equal to the user-supplied criteria. For optimization methods or other iterative calculations, the removed data are re-applied at each iteration to the model computing new results. Using model values for the complete dataset, new error values are computed and the outlier bias reduction procedure is re-applied. Overall error is minimized for model coefficients and outlier-removed data in an iterative fashion until user-defined error improvement limits are reached. The filtered data may be used for validation, outlier bias reduction, and data quality operations.
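The removal rule in the abstract (drop a record when its error is greater than or equal to a user-supplied criterion) can be illustrated with a minimal Python sketch. The error definitions used here (absolute difference, and absolute difference divided by the magnitude of the actual value) and the names `error_values`, `split_outliers`, `abs_limit`, and `rel_limit` are assumptions made for illustration; the record above does not define them.

```python
def error_values(actual, predicted):
    """Absolute and relative error per record (assumed definitions)."""
    abs_err = [abs(p - a) for a, p in zip(actual, predicted)]
    # Relative error is undefined for a zero actual value; treat it as infinite.
    rel_err = [e / abs(a) if a != 0 else float("inf")
               for a, e in zip(actual, abs_err)]
    return abs_err, rel_err


def split_outliers(actual, predicted, abs_limit=None, rel_limit=None):
    """Return (kept_indices, removed_indices).

    A record is removed when an error value is greater than or equal to
    a supplied limit. abs_limit and rel_limit are hypothetical names for
    the user-supplied criteria; pass None to skip that test.
    """
    abs_err, rel_err = error_values(actual, predicted)
    kept, removed = [], []
    for i in range(len(abs_err)):
        is_outlier = (
            (abs_limit is not None and abs_err[i] >= abs_limit)
            or (rel_limit is not None and rel_err[i] >= rel_limit)
        )
        (removed if is_outlier else kept).append(i)
    return kept, removed


# Illustrative values only; the last record is far from its model value.
actual = [10.0, 12.0, 11.0, 50.0]
predicted = [10.5, 11.5, 11.2, 12.0]
kept, removed = split_outliers(actual, predicted, abs_limit=5.0, rel_limit=0.5)
print(kept, removed)
```

Passing both limits applies the "or both" case from the abstract: a record is dropped if either test flags it.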
Publication number: US9111212(B2) Publication date: 2015.08.18
Application number: US201313772212 Filing date: 2013.02.20
Applicant: HARTFORD STEAM BOILER INSPECTION AND INSURANCE COMPANY Inventor: Jones Richard Bradley
Classification: G06N7/00;G06F17/18;G06K9/62 Main classification: G06N7/00
Agency: Greenberg Traurig, LLP Agent: Greenberg Traurig, LLP
Principal claim: 1. A system for reducing outlier bias in target variables measured for a facility, comprising: an input unit for inputting one or more data sets to be processed, wherein the input unit comprises a measuring device configured to: measure one or more target variables for the facility; and provide a corresponding data set for each of the target variables; a computing unit coupled to the input unit and for processing the data sets, wherein the computing unit comprises a processor and a storage subsystem; and an output unit coupled to the computing unit and for outputting one or more of the processed data sets received from the computing unit, wherein a computer program stored by the storage subsystem comprises instructions that, when executed reduces outlier bias for one of the processed data sets by causing the processor to: select one of the target variables for the reduction of outlier bias; obtain a complete data set of the one of the target variables from the input unit, wherein the complete data set of the one of the target variables comprises a plurality of inputted data values; obtain a bias criteria used to determine one or more outliers; determine a set of model coefficients for a mathematical model; (1) apply the mathematical model with the set of model coefficients to the complete data set to determine a set of model predicted values; (2) generate an error set by comparing the set of model predicted values to corresponding actual values of the complete data set; (3) generate a set of error threshold values from the error set and the bias criteria; (4) generate a removed data set comprising elements of the complete data set with corresponding error set values outside the set of error threshold values; (5) generate a censored data set comprising all elements of the complete data set that are not within the removed data set; (6) determine a set of updated model coefficients for the mathematical model based on the censored data set; and (7) repeat steps (1)-(6) as an iteration unless a censoring performance termination criteria is satisfied, whereby at the iteration the set of predicted values, the error set, the set of error threshold values, the removed data set, and the censored data set are generated using the set of updated model coefficients.
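Steps (1)-(7) of the claim can be sketched as a loop in Python. This is an illustration under stated assumptions, not the patented implementation: the mathematical model is assumed to be a straight line fitted by least squares, the error is assumed to be the absolute difference, the bias criteria is assumed to be a quantile `bias_q` of the error set, and the termination criteria is assumed to be a total-error improvement below `tol`. All function and parameter names are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def quantile(values, q):
    """Nearest-rank quantile of values for q in (0, 1]."""
    ordered = sorted(values)
    idx = max(0, math.ceil(q * len(ordered)) - 1)
    return ordered[idx]


def reduce_outlier_bias(xs, ys, bias_q=0.85, tol=1e-6, max_iter=50):
    """Iteratively censor high-error records and refit (assumed model: a line)."""
    a, b = fit_line(xs, ys)            # initial coefficients, complete data set
    prev_total = float("inf")
    kept, removed = list(range(len(xs))), []
    for _ in range(max_iter):
        preds = [a + b * x for x in xs]                              # step (1)
        errs = [abs(p - y) for p, y in zip(preds, ys)]               # step (2)
        threshold = quantile(errs, bias_q)                           # step (3)
        removed = [i for i, e in enumerate(errs) if e > threshold]   # step (4)
        kept = [i for i, e in enumerate(errs) if e <= threshold]     # step (5)
        a, b = fit_line([xs[i] for i in kept],
                        [ys[i] for i in kept])                       # step (6)
        # Step (7): stop once the censored-set error stops improving (assumed).
        total = sum(abs(a + b * xs[i] - ys[i]) for i in kept)
        if prev_total - total < tol:
            break
        prev_total = total
    return (a, b), kept, removed


# Illustrative data: y = 2x + 1 with one corrupted record at x = 5.
xs = [float(x) for x in range(10)]
ys = [2 * x + 1 for x in xs]
ys[5] = 40.0
(a, b), kept, removed = reduce_outlier_bias(xs, ys)
print(a, b, removed)
```

Each pass scores the complete data set again, so a record removed in one iteration can return in a later one if the updated coefficients bring its error back inside the threshold.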
Address: Hartford CT US