Abstract |
Computer-implemented systems and methods are disclosed for comparing and associating objects. In some embodiments, a method is provided for associating a first object with one or more objects within a plurality of objects, each object comprising a first plurality of properties, each property comprising data reflecting a characteristic of an entity represented by the object, the associated objects comprising matching data in corresponding properties for a second plurality of properties. The method may include executing, for each object within the plurality of objects and for the first object, the following: creating a slug for the object, the slug comprising the second plurality of properties from the object; and inputting the slug for the object into a Bloom filter. Further, the method may include creating, for a bin within the Bloom filter corresponding to the slug for the first object, an association between objects whose slugs correspond to the bin if the slugs for those objects match. [Flowchart 100: size a Bloom filter for a target error rate for the corpus size; generate a slug for the target object; determine the Bloom filter bin corresponding to the slug for the target object; generate a slug for each object in the corpus; determine the Bloom filter bin corresponding to the slug for each object in the corpus; for each slug for an object in the corpus whose bin is the same bin as the slug for the target object, add the slug for the object in the corpus and its corresponding object to a multimap; output the multimap values whose keys match the slug for the target object.] |
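
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the disclosed implementation. It rests on several assumptions that do not come from the source: objects are plain dictionaries, a slug is the selected property values joined with a separator character, each slug maps to a single bin through one SHA-256-derived hash, and the number of bins follows the standard Bloom filter sizing formula m = -n ln p / (ln 2)^2. The names `make_slug`, `bin_count`, `bin_index`, and `associate` are hypothetical.

```python
import hashlib
import math
from collections import defaultdict


def make_slug(obj, keys):
    """Build a slug from the selected properties (separator is an assumption)."""
    return "\x1f".join(str(obj[k]) for k in keys)


def bin_count(n_items, error_rate):
    """Standard Bloom filter sizing: m = -n * ln(p) / (ln 2)^2."""
    return max(1, math.ceil(-n_items * math.log(error_rate) / (math.log(2) ** 2)))


def bin_index(slug, n_bins):
    """Map a slug to one bin using a single hash (a simplifying assumption)."""
    digest = hashlib.sha256(slug.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_bins


def associate(target, corpus, keys, error_rate=0.01):
    """Return the corpus objects whose slug matches the target's slug."""
    n_bins = bin_count(len(corpus) + 1, error_rate)
    target_slug = make_slug(target, keys)
    target_bin = bin_index(target_slug, n_bins)

    # Multimap: slug -> objects, filled only from the target's bin.
    multimap = defaultdict(list)
    for obj in corpus:
        slug = make_slug(obj, keys)
        if bin_index(slug, n_bins) == target_bin:
            multimap[slug].append(obj)

    # Different slugs may share a bin, so the final lookup is by exact slug.
    return multimap.get(target_slug, [])


corpus = [
    {"id": 1, "name": "Ann Lee", "city": "Oslo"},
    {"id": 2, "name": "Bob Roy", "city": "Oslo"},
    {"id": 3, "name": "Ann Lee", "city": "Oslo"},
]
target = {"id": 99, "name": "Ann Lee", "city": "Oslo"}
matches = associate(target, corpus, keys=("name", "city"))
print([m["id"] for m in matches])
```

In this sketch, the bin comparison only narrows the candidates. Because unrelated slugs can land in the same bin, the final lookup in the multimap by the exact slug is what decides which objects are associated with the target.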