Title of invention: System and method for continuous diagnosis of data streams
Abstract: In connection with the mining of time-evolving data streams, a general framework is provided that mines changes and reconstructs models from a data stream with unlabeled instances or a limited number of labeled instances. In particular, statistical profiling methods are defined herein that extend a classification tree in order to guess the percentage of drift in the data stream without any labeled data. The exact error can be estimated by actively sampling a small number of true labels. If the estimated error is significantly higher than empirical expectations, a small number of true labels are preferably re-sampled to reconstruct the decision tree from the leaf node level.
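The error-estimation step described in the abstract can be illustrated with a minimal sketch. It is not the patented method: it substitutes plain uniform random sampling for the active sampling the abstract mentions, and it reads "significantly higher" as a one-sided normal-approximation test. The names `estimate_error`, `needs_rebuild`, `label_oracle`, and the threshold `z` are all hypothetical and do not come from the record.

```python
import math
import random


def estimate_error(predictions, label_oracle, sample_size, rng):
    """Estimate the error rate from a small sample of true labels.

    Assumption: uniform random sampling without replacement stands in
    for the active sampling named in the abstract.
    """
    indices = rng.sample(range(len(predictions)), sample_size)
    mistakes = sum(1 for i in indices if predictions[i] != label_oracle(i))
    error = mistakes / sample_size
    # Standard error of a sample proportion (normal approximation).
    std_err = math.sqrt(error * (1.0 - error) / sample_size)
    return error, std_err


def needs_rebuild(error, std_err, expected_error, z=1.96):
    """Return True when the lower bound of the estimate exceeds expectations.

    Assumption: z = 1.96 is an illustrative significance threshold.
    """
    return error - z * std_err > expected_error


if __name__ == "__main__":
    rng = random.Random(0)
    true_labels = [0] * 1000
    # Hypothetical model output: wrong on the first 300 instances.
    predictions = [1] * 300 + [0] * 700
    error, std_err = estimate_error(
        predictions, lambda i: true_labels[i], sample_size=100, rng=rng
    )
    print(error, std_err, needs_rebuild(error, std_err, expected_error=0.10))
```

In this sketch only `sample_size` true labels are requested from the oracle, which mirrors the abstract's point that a small number of labels is enough to decide whether the tree should be reconstructed.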
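Publication number: US2006010093(A1) Publication date: 2006.01.12
Application number: US20040880913 Filing date: 2004.06.30
Applicant: IBM CORPORATION Inventors: FAN WEI; WANG HAIXUN; YU PHILIP S.
Classification: G06F17/30 Main classification: G06F17/30
Agency: Agent:
Main claim:
Address: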