USE OF SEQUENTIAL CLUSTERING FOR INSTANCE SELECTION IN MACHINE CONDITION MONITORING
Abstract
<p>A method is provided for selecting a representative set of training data for training a statistical model in a machine condition monitoring system. The method reduces the time required to choose representative samples from a large data set by using a nearest-neighbor sequential clustering technique in combination with a kd-tree. A distance threshold is used to limit the geometric size of the clusters. Each node of the kd-tree is assigned a representative sample from the training data, and similar samples are subsequently discarded.</p>
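<p>The selection procedure summarized above can be illustrated with a minimal sketch. It is not the claimed implementation. It assumes Euclidean distance, an unbalanced kd-tree built by incremental insertion, and a rule that a sample becomes a new representative only when its nearest stored representative lies farther away than the threshold. The names <code>Node</code>, <code>insert</code>, <code>nearest</code>, <code>select_representatives</code> and <code>threshold</code> are hypothetical and chosen for illustration only.</p>

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Point = Tuple[float, ...]


@dataclass
class Node:
    """One kd-tree node; it stores exactly one representative sample."""
    point: Point
    axis: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root: Optional[Node], point: Point) -> Node:
    """Insert a new representative, cycling the split axis by depth."""
    if root is None:
        return Node(point, 0)
    k = len(point)
    node = root
    while True:
        side = "left" if point[node.axis] < node.point[node.axis] else "right"
        child = getattr(node, side)
        if child is None:
            setattr(node, side, Node(point, (node.axis + 1) % k))
            return root
        node = child


def nearest(node: Optional[Node], query: Point,
            best: Tuple[float, Optional[Point]] = (math.inf, None)):
    """Return (distance, point) of the stored representative closest to query."""
    if node is None:
        return best
    d = math.dist(query, node.point)  # Euclidean distance (an assumption)
    if d < best[0]:
        best = (d, node.point)
    diff = query[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best hit.
    if abs(diff) < best[0]:
        best = nearest(far, query, best)
    return best


def select_representatives(samples: Sequence[Sequence[float]], threshold: float):
    """Sequentially keep a sample only if no representative is within threshold."""
    root: Optional[Node] = None
    representatives = []
    for sample in samples:
        p = tuple(float(x) for x in sample)
        dist, _ = nearest(root, p)
        if dist > threshold:  # starts a new cluster; otherwise it is discarded
            root = insert(root, p)
            representatives.append(p)
    return representatives
```

<p>With the samples <code>[(0, 0), (0.1, 0), (5, 5), (5.05, 5.02), (0, 0.05), (10, 0)]</code> and a threshold of <code>0.5</code>, this sketch keeps <code>(0, 0)</code>, <code>(5, 5)</code> and <code>(10, 0)</code> and discards the other three as similar. Because the tree is never rebalanced and the search is recursive, sorted or very large inputs may need a balanced or iterative variant.</p>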