Title: Date updating in support of data analysis
Abstract: A computing device updates date values in a read dataset to support data analytics. Outlier and non-outlier date values are identified by, for each date value as a respective date value, reading a predefined number of neighboring date values relative to the respective date value; computing a median value and a median absolute deviation value of the predefined number of neighboring date values; computing a difference between the respective date value and the median value; dividing an absolute value of the difference by the median absolute deviation value to define a deviation value; comparing the deviation value to a threshold deviation value; and, based on the comparison, identifying the respective date value as an outlier or a non-outlier date value. Each identified non-outlier date value is updated with a new date computed using a date offset value. Each identified non-outlier date value is replaced with its updated value in a date updated dataset.
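The identification steps in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation. The function names `deviation` and `flag_outliers`, the window of `k` neighbors on each side, the exclusion of the respective value from its own neighborhood, the default threshold of 3.0, the handling of a zero median absolute deviation, and the rule that a deviation above the threshold means "outlier" are all assumptions that the record does not specify.

```python
from datetime import date
from statistics import median


def deviation(values, i, k):
    """Deviation value of values[i] relative to up to k neighbors on each side.

    Assumption: the respective value itself is excluded from the neighborhood.
    """
    lo, hi = max(0, i - k), min(len(values), i + k + 1)
    neighbors = values[lo:i] + values[i + 1:hi]
    med = median(neighbors)
    # Median absolute deviation (MAD) of the neighboring values.
    mad = median(abs(v - med) for v in neighbors)
    diff = abs(values[i] - med)
    if mad == 0:
        # Assumption: with zero spread, any nonzero difference is extreme.
        return 0.0 if diff == 0 else float("inf")
    return diff / mad


def flag_outliers(dates, k=2, threshold=3.0):
    """Return one boolean per date: True if flagged as an outlier."""
    # Dates are converted to day ordinals so they can be compared numerically.
    ordinals = [d.toordinal() for d in dates]
    return [deviation(ordinals, i, k) > threshold for i in range(len(ordinals))]


sample = [
    date(2016, 1, 1), date(2016, 1, 2), date(2016, 1, 4),
    date(1900, 1, 1),  # placeholder-style date far from its neighbors
    date(2016, 1, 7), date(2016, 1, 9), date(2016, 1, 10),
]
flags = flag_outliers(sample)
```

Because the median and the median absolute deviation are both robust statistics, a single extreme neighbor (here the 1900 date) does not cause the ordinary dates around it to be flagged.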
Publication number: US9524315(B1)
Publication date: 2016.12.20
Application number: US201615222329
Filing date: 2016.07.28
Applicant: SAS Institute Inc.
Inventors: Bonham Robert N.; Holzworth Steven C.; Hayes Keefe
Classification: G06F11/00; G06F17/30
Main classification: G06F11/00
Agency: Bell & Manning, LLC
Agent: Bell & Manning, LLC
Main claim: 1. A non-transitory computer-readable medium having stored thereon computer-readable instructions that when executed by a computing device cause the computing device to: read a dataset; identify date values in the read dataset; identify outlier and non-outlier date values included in the identified date values by, for each date value of the identified date values as a respective date value, reading a predefined number of neighboring date values relative to the respective date value; computing a median value and a median absolute deviation value of the read predefined number of neighboring date values; computing a difference between the respective date value and the computed median value; dividing an absolute value of the computed difference by the computed median absolute deviation value to define a deviation value of the respective date value; comparing the defined deviation value to a threshold deviation value; and based on the comparison, identifying the respective date value as either an outlier date value or a non-outlier date value; determine a date offset value; update each identified non-outlier date value read from the dataset with a new date computed using the determined date offset value; and store the read dataset to a date updated dataset by replacing each identified non-outlier date value with a corresponding updated date value.
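The final steps of the claim (determining a date offset, updating the non-outlier dates, and replacing them in the output) might look like the sketch below. The claim does not say how the offset is determined; aligning the latest non-outlier date with a chosen target date is purely an assumption here, as are the helper names `determine_offset` and `shift_non_outliers` and the representation of the outlier flags as a list of booleans.

```python
from datetime import date


def determine_offset(dates, is_outlier, target_latest):
    """Hypothetical rule: offset that moves the latest non-outlier date
    onto target_latest. The claim leaves the actual rule unspecified."""
    latest = max(d for d, out in zip(dates, is_outlier) if not out)
    return target_latest - latest  # a datetime.timedelta


def shift_non_outliers(dates, is_outlier, offset):
    """Return a new list in which each non-outlier date is replaced by
    date + offset; dates flagged as outliers are copied unchanged."""
    return [d if out else d + offset for d, out in zip(dates, is_outlier)]


dates = [date(2016, 1, 1), date(1900, 1, 1), date(2016, 1, 10)]
is_outlier = [False, True, False]  # e.g. produced by a MAD-based check

offset = determine_offset(dates, is_outlier, target_latest=date(2016, 7, 28))
updated = shift_non_outliers(dates, is_outlier, offset)
```

In this sketch the flagged 1900 date passes through untouched, while the two ordinary dates move forward by the same number of days, so the intervals between them are preserved.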
Address: Cary NC US