Title: Scalable bootstrap method for assessing the quality of machine learning algorithms over massive time series
Abstract: Described is a system for assessing the quality of machine learning algorithms over massive time series. A set of random blocks of a time series data sample of size n is selected in parallel. Then, r resamples are generated, in parallel, by applying a bootstrapping method to each block in the set of random blocks to obtain a resample of size n, where r is not fixed. Errors are estimated on the r resamples, and a final accuracy estimate is produced by averaging the errors estimated on the r resamples.
Publication number: US9530104(B1) Publication date: 2016.12.27
Application number: US201414175899 Filing date: 2014.02.07
Applicant: HRL Laboratories, LLC Inventors: Laptev Nikolay; Lu Tsai-Ching
Classification: G06F15/18; G06N99/00 Main classification: G06F15/18
Agency: Tope-McKay & Associates Agent: Tope-McKay & Associates
Main claim: 1. A system for assessing the quality of machine learning algorithms over time series, the system comprising: one or more processors and a non-transitory memory having instructions encoded thereon such that when the instructions are executed, the one or more processors perform operations of: selecting, in parallel, a set of random blocks of a time series data sample, wherein the time series data sample comprises a plurality of data points X1, . . . , Xn, wherein n is the number of data points in the time series data sample, and wherein the time series data sample is a sample of a much larger time series dataset; generating, in parallel, a set of resamples by applying a bootstrapping method to each block in the set of random blocks to obtain a resample for each block, wherein the number of resamples in the set of resamples is not fixed; determining errors on the set of resamples, wherein the errors represent variation within the set of resamples; producing a final accuracy estimate by averaging the errors estimated on the set of resamples, wherein the final accuracy estimate is an estimate of how accurately the set of random blocks represents the time series data sample; and using the final accuracy estimate to assess a quality of at least one machine learning algorithm over the time series dataset.
Address: Malibu CA US
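
The steps named in the abstract and main claim (random blocks, bootstrap resamples of size n per block, an error per block, and an average of those errors) might be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation. The record does not say how a resample is built inside a block, how the block length is chosen, or how the number of resamples r is decided, so everything below is hypothetical: the moving-block resampling scheme, the block length of about n^0.7, the convergence rule governed by `tol`, `min_r` and `max_r`, the use of a thread pool, and all function and parameter names.

```python
import random
import statistics
from concurrent.futures import ThreadPoolExecutor


def moving_block_resample(block, n, sub_len, rng):
    """Build a resample of length n from contiguous runs of `block`.

    Assumption: runs of length `sub_len` are drawn with replacement so that
    short-range time dependence is kept inside each run.
    """
    last_start = len(block) - sub_len
    out = []
    while len(out) < n:
        s = rng.randint(0, last_start)
        out.extend(block[s:s + sub_len])
    return out[:n]


def block_error(series, block_len, sub_len, estimator, seed,
                min_r=20, max_r=200, tol=0.05):
    """Pick one random block and estimate the error of `estimator` on it.

    The error is the standard deviation of the estimator across resamples.
    Resamples are added in batches until the error stabilises (assumed rule),
    so the number of resamples r is not fixed in advance.
    """
    rng = random.Random(seed)
    n = len(series)
    start = rng.randint(0, n - block_len)
    block = series[start:start + block_len]

    def draw():
        return estimator(moving_block_resample(block, n, sub_len, rng))

    estimates = [draw() for _ in range(min_r)]
    error = statistics.stdev(estimates)
    while len(estimates) < max_r:
        estimates.extend(draw() for _ in range(min_r))
        new_error = statistics.stdev(estimates)
        converged = abs(new_error - error) <= tol * max(new_error, 1e-12)
        error = new_error
        if converged:
            break
    return error


def assess_quality(series, estimator, num_blocks=8, block_len=None,
                   sub_len=10, seed=0):
    """Average the per-block errors into one accuracy estimate."""
    n = len(series)
    if block_len is None:
        # Assumed default: a block much smaller than n, but at least sub_len.
        block_len = min(n, max(sub_len, int(n ** 0.7)))

    def run_block(block_seed):
        return block_error(series, block_len, sub_len, estimator, block_seed)

    # Blocks are independent, so they can be processed concurrently.
    with ThreadPoolExecutor() as pool:
        errors = list(pool.map(run_block, range(seed, seed + num_blocks)))
    return statistics.mean(errors)


if __name__ == "__main__":
    rng = random.Random(1)
    series = [0.0]
    for _ in range(999):  # synthetic autocorrelated series for the demo
        series.append(0.6 * series[-1] + rng.gauss(0.0, 1.0))
    print(assess_quality(series, statistics.fmean))
```

Here `estimator` stands in for whatever quantity the machine learning algorithm under assessment produces on a series; the sample mean is used only to keep the demo short. Threads are used for brevity; for CPU-bound estimators a process pool or a cluster framework would be the more realistic choice, and `run_block` would then need to be a module-level function.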