Title of invention Apparatus and method for selecting a working data set for model development
Abstract The present invention provides a data selection apparatus which augments a set of training examples with the desired output data. The resulting augmented data set is normalized such that the augmented data values range between -1 and +1 and such that the mean of the augmented data set is zero. The data selection apparatus then groups the augmented and normalized data set into related clusters using a clusterizer. Preferably, the clusterizer is a neural network such as a Kohonen self-organizing map (SOM). The data selection apparatus further applies an extractor to cull a working set of data from the clusterized data set. The present invention thus picks, or filters, a set of data which is more nearly uniformly distributed across the portion of the input space of interest, in order to minimize the maximum absolute error over the entire input space. The output of the data selection apparatus is used to train the analyzer with important subsets of the training data rather than with all available training data. A smaller training data set significantly reduces the complexity of the model building or analyzer construction process.
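The four stages named in the abstract (augment, normalize, clusterize, extract) can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the patented implementation: the function names, the one-dimensional map, the learning-rate and radius schedules, and the rule of keeping the first `per_cluster` members of each cluster are all hypothetical choices that the abstract does not specify.

```python
import math
import random

def augment(inputs, outputs):
    """Append each desired output vector to its input vector."""
    return [list(x) + list(y) for x, y in zip(inputs, outputs)]

def normalize(rows):
    """Center each column at zero mean, then scale it into [-1, 1]."""
    n_cols = len(rows[0])
    means = [sum(r[j] for r in rows) / len(rows) for j in range(n_cols)]
    centered = [[r[j] - means[j] for j in range(n_cols)] for r in rows]
    # Guard against constant columns, whose largest deviation is zero.
    scales = [max(abs(r[j]) for r in centered) or 1.0 for j in range(n_cols)]
    return [[r[j] / scales[j] for j in range(n_cols)] for r in centered]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def train_som(rows, n_nodes=4, epochs=50, seed=0):
    """Train a 1-D Kohonen map and return its node weight vectors.

    The linear decay schedules below are assumptions for illustration.
    """
    rng = random.Random(seed)
    nodes = [list(rng.choice(rows)) for _ in range(n_nodes)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs
        rate = 0.5 * frac
        radius = max(n_nodes / 2.0 * frac, 0.5)
        for row in rows:
            # Best-matching unit: the node closest to this sample.
            bmu = min(range(n_nodes), key=lambda i: sq_dist(nodes[i], row))
            for i in range(n_nodes):
                # Gaussian neighborhood over map index distance.
                h = math.exp(-((i - bmu) ** 2) / (2.0 * radius ** 2))
                for j in range(len(row)):
                    nodes[i][j] += rate * h * (row[j] - nodes[i][j])
    return nodes

def select_working_set(inputs, outputs, n_nodes=4, per_cluster=2):
    """Return sorted indices of the examples kept as the working set."""
    rows = normalize(augment(inputs, outputs))
    nodes = train_som(rows, n_nodes=n_nodes)
    clusters = {}
    for idx, row in enumerate(rows):
        bmu = min(range(n_nodes), key=lambda i: sq_dist(nodes[i], row))
        clusters.setdefault(bmu, []).append(idx)
    kept = []
    for members in clusters.values():
        # Assumed extraction rule: cap the number taken from each cluster.
        kept.extend(members[:per_cluster])
    return sorted(kept)

if __name__ == "__main__":
    # Fifty densely packed examples plus two sparse ones (made-up data).
    xs = [[i * 0.01] for i in range(50)] + [[5.0], [9.0]]
    ys = [[x[0] ** 2] for x in xs]
    print(select_working_set(xs, ys))
```

Capping the number of examples drawn from each cluster is one simple way to keep a dense region from dominating the working set, which is the effect the abstract describes as a more nearly uniform distribution over the input space.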
Publication number US5809490(A) Publication date 1998.09.15
Application number US19960642779 Filing date 1996.05.03
Applicant Inventor
Classification G06K9/62;G06N3/08;(IPC1-7):G06F15/18 Main classification G06K9/62
Agency Agent
Main claim
Address