Abstract |
A content-addressable and searchable storage system for managing and exploring massive amounts of feature-rich data, such as images, audio or scientific data, is disclosed. The system comprises a segmentation and feature extraction unit for segmenting data corresponding to an object into a plurality of data segments and generating a feature vector for each data segment; a sketch construction component for converting each feature vector into a compact bit-vector corresponding to the object; a similarity index comprising a plurality of compact bit-vectors corresponding to a plurality of objects; and an index insertion component for inserting a compact bit-vector corresponding to an object into the similarity index. The system may further comprise an indexing unit for identifying a candidate set of objects from said similarity index based upon a compact bit-vector corresponding to a query object. Still further, the system may additionally comprise a similarity ranking component for ranking objects in said candidate set by estimating their distances to the query object.
|
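The pipeline described in the abstract (sketch construction, index insertion, candidate identification, similarity ranking) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the abstract does not specify how bit-vectors are built, so random-hyperplane sign projection is swapped in as a stand-in technique, segmentation and feature extraction are assumed to have already produced the feature vectors, and all names (`make_hyperplanes`, `sketch`, `SimilarityIndex`, `max_dist`) are hypothetical.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    # Assumption: one random Gaussian hyperplane per output bit.
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def sketch(vec, planes):
    # Sketch construction (illustrative): bit i is set when the feature
    # vector lies on the non-negative side of hyperplane i.
    bits = 0
    for i, plane in enumerate(planes):
        if sum(p * x for p, x in zip(plane, vec)) >= 0.0:
            bits |= 1 << i
    return bits


def hamming(a, b):
    # Number of differing bits between two bit-vectors stored as ints.
    return bin(a ^ b).count("1")


class SimilarityIndex:
    """Hypothetical in-memory index: object id -> list of segment sketches."""

    def __init__(self, planes):
        self.planes = planes
        self.sketches = {}

    def insert(self, obj_id, segment_vectors):
        # Index insertion: store one compact bit-vector per data segment.
        self.sketches[obj_id] = [sketch(v, self.planes) for v in segment_vectors]

    def candidates(self, query_sketches, max_dist):
        # Candidate identification: keep objects with at least one segment
        # sketch within max_dist bits of some query sketch (linear scan).
        return {
            obj_id
            for obj_id, sks in self.sketches.items()
            if any(hamming(q, s) <= max_dist for q in query_sketches for s in sks)
        }

    def rank(self, query_sketches, candidate_ids):
        # Similarity ranking: estimate object distance as the sum, over the
        # query segments, of the Hamming distance to the closest segment.
        def estimated_distance(obj_id):
            sks = self.sketches[obj_id]
            return sum(min(hamming(q, s) for s in sks) for q in query_sketches)

        return sorted(candidate_ids, key=estimated_distance)
```

In use, one would build the hyperplanes once, call `insert` for each stored object, sketch the segments of a query object with the same hyperplanes, and pass the result to `candidates` and then `rank`. The linear scan in `candidates` is chosen only for brevity; a real index would presumably avoid comparing against every stored sketch.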