Title of invention |
Computer-implemented systems and methods for comparing and associating objects |
Abstract |
Computer-implemented systems and methods are disclosed for comparing and associating objects. In some embodiments, a method is provided for associating a first object with one or more objects within a plurality of objects, each object comprising a first plurality of properties, each property comprising data reflecting a characteristic of an entity represented by the object, the associated objects comprising matching data in corresponding properties for a second plurality of properties. The method may include executing, for each object within the plurality of objects and for the first object, the following: creating a slug for the object, the slug comprising the second plurality of properties from the object; and inputting the slug for the object into a Bloom filter. Further, the method may include creating for a bin within the Bloom filter corresponding to the slug for the first object, an association between objects whose slugs correspond to the bin if the slugs for those objects match. |
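The association step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the names `make_slug`, `bin_index`, and `associate` are hypothetical, objects are assumed to be plain dictionaries, and a single-hash bin table stands in for the Bloom filter so that each slug corresponds to exactly one bin.

```python
import hashlib
from collections import defaultdict


def make_slug(obj, keys):
    """Build a slug from the selected properties (hypothetical helper)."""
    return tuple(obj.get(k) for k in keys)


def bin_index(slug, num_bins):
    """Map a slug to one bin using a single hash function (an assumption)."""
    digest = hashlib.sha256(repr(slug).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_bins


def associate(first, others, keys, num_bins=1024):
    """Return the objects in `others` whose slug matches the slug of `first`."""
    bins = defaultdict(list)  # bin index -> list of (slug, object)
    for obj in [first] + list(others):
        slug = make_slug(obj, keys)
        bins[bin_index(slug, num_bins)].append((slug, obj))

    first_slug = make_slug(first, keys)
    candidates = bins[bin_index(first_slug, num_bins)]
    # Different slugs can share a bin, so the slugs themselves are compared
    # before an association is created.
    return [obj for slug, obj in candidates
            if slug == first_slug and obj is not first]
```

The explicit slug comparison matters because a shared bin only indicates a possible match; with `num_bins=1` every object lands in the same bin and the result is still correct.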
Publication number |
US8924389(B2) |
Publication date |
2014.12.30 |
Application number |
US201314140415 |
Filing date |
2013.12.24 |
Applicant |
Palantir Technologies Inc. |
Inventors |
Elliot Mark;Chang Allen |
Classification |
G06F7/00;G06F17/30 |
Main classification |
G06F7/00 |
Agency |
Finnegan, Henderson, Farabow, Garrett & Dunner, LLP |
Agent |
Finnegan, Henderson, Farabow, Garrett & Dunner, LLP |
Principal claim |
1. A method for identifying unique objects within a plurality of objects, each object comprising a first plurality of properties, each property comprising data reflecting a characteristic of an entity represented by the object, the method comprising the following operations performed by one or more processors:
executing, for each object within the plurality of objects, the following:
  creating a slug for the object, the slug comprising a second plurality of properties from the object that includes at least some of the first plurality of properties; and
  inputting the slug for the object into a counting Bloom filter;
identifying for each created slug whose corresponding bin within the counting Bloom filter has a count value equal to 1, the object associated with the slug as unique within the plurality of objects;
inputting, using at least one processor, for each created slug, the slug and its corresponding object into a multimap, if a bin within the counting Bloom filter corresponding to the slug has a count value greater than 1, wherein the slug is a key to the multimap and the object is a value to the multimap; and
identifying for each multimap key with one value, the object associated with the slug stored as the key as unique within the plurality of objects. |
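The operations recited in the claim can be sketched as below. This is an illustrative reading under stated assumptions, not the patentee's implementation: `CountingBloomFilter`, `make_slug`, and `find_unique` are hypothetical names, objects are assumed to be dictionaries, and the filter uses a single hash function so that each slug has one corresponding bin, as the claim language suggests.

```python
import hashlib
from collections import defaultdict


def make_slug(obj, keys):
    """Build a slug from the selected properties (hypothetical helper)."""
    return tuple(obj.get(k) for k in keys)


class CountingBloomFilter:
    """Single-hash counting filter: one counter per bin (an assumption)."""

    def __init__(self, num_bins=1024):
        self.counts = [0] * num_bins

    def _bin(self, slug):
        digest = hashlib.sha256(repr(slug).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.counts)

    def add(self, slug):
        self.counts[self._bin(slug)] += 1

    def count(self, slug):
        return self.counts[self._bin(slug)]


def find_unique(objects, keys, num_bins=1024):
    """Return the objects whose slug occurs exactly once among `objects`."""
    cbf = CountingBloomFilter(num_bins)
    slugs = [make_slug(obj, keys) for obj in objects]
    for slug in slugs:
        cbf.add(slug)

    unique = []
    multimap = defaultdict(list)  # slug (key) -> objects (values)
    for slug, obj in zip(slugs, objects):
        if cbf.count(slug) == 1:
            # A count of 1 means no other slug reached this bin.
            unique.append(obj)
        else:
            # A count above 1 may be a true duplicate or a bin collision;
            # the multimap separates the two cases.
            multimap[slug].append(obj)

    for values in multimap.values():
        if len(values) == 1:
            unique.append(values[0])
    return unique
```

In this sketch the counting filter acts as a cheap first pass: only slugs whose bin count exceeds 1 are stored in the multimap, which then distinguishes genuine duplicates from objects that merely collided with a different slug.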
Address |
Palo Alto CA US |