摘要 |
A system, method and computer-readable medium are disclosed for identifying representative data using sketches. The method embodiment comprises generating a plurality of vectors from a data set, modifying each of the vectors of the plurality of vectors and selecting one of the plurality of generated vectors according to a comparison of a summed distance between a modified vector associated with the selected generated vector and remaining modified vectors. Modifying the generated vectors may involve reduced each generated vector to a lower dimensional vector. The summed distance then represents a summed distance between the lower dimensional vector and remaining lower dimensional vectors.
|