Title of invention: Computer-implemented systems and methods for comparing and associating objects
Abstract: Computer-implemented systems and methods are disclosed for comparing and associating objects. In some embodiments, a method is provided for associating a first object with one or more objects within a plurality of objects, each object comprising a first plurality of properties, each property comprising data reflecting a characteristic of an entity represented by the object, the associated objects comprising matching data in corresponding properties for a second plurality of properties. The method may include executing, for each object within the plurality of objects and for the first object, the following: creating a slug for the object, the slug comprising the second plurality of properties from the object; and inputting the slug for the object into a Bloom filter. Further, the method may include creating for a bin within the Bloom filter corresponding to the slug for the first object, an association between objects whose slugs correspond to the bin if the slugs for those objects match.
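The association step described in the abstract can be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions, not the patented implementation: the helper names (`make_slug`, `bin_index`, `associate`), the use of SHA-256 as the single hash function, and the choice to let each bin hold (slug, object) pairs are all hypothetical. A conventional Bloom filter stores only bits, so the bin contents here are a simplification for readability.

```python
import hashlib
from collections import defaultdict


def make_slug(obj, slug_props):
    """Build a slug from the selected properties (the 'second plurality')."""
    return tuple(obj.get(p) for p in slug_props)


def bin_index(slug, num_bins):
    """Map a slug to one bin using a single hash function (an assumption)."""
    digest = hashlib.sha256(repr(slug).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_bins


def associate(first, objects, slug_props, num_bins=1024):
    """Return the objects whose slug matches the slug of `first`."""
    bins = defaultdict(list)
    # Input the slug of every object, including the first object, into a bin.
    for obj in [first, *objects]:
        slug = make_slug(obj, slug_props)
        bins[bin_index(slug, num_bins)].append((slug, obj))

    # Only the bin for the first object's slug needs to be inspected.
    first_slug = make_slug(first, slug_props)
    candidates = bins[bin_index(first_slug, num_bins)]

    # Different slugs can share a bin, so confirm that the slugs really match.
    return [obj for slug, obj in candidates
            if slug == first_slug and obj is not first]
```

Because unrelated slugs may hash to the same bin, the final equality check on the slugs is what makes the association reliable; the bin lookup only narrows the set of candidates.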
Publication number: US8924389(B2)  Publication date: 2014.12.30
Application number: US201314140415  Filing date: 2013.12.24
Applicant: Palantir Technologies Inc.  Inventors: Elliot Mark; Chang Allen
Classification: G06F7/00; G06F17/30  Main classification: G06F7/00
Agency: Finnegan, Henderson, Farabow, Garrett & Dunner, LLP  Agent: Finnegan, Henderson, Farabow, Garrett & Dunner, LLP
Main claim: 1. A method for identifying unique objects within a plurality of objects, each object comprising a first plurality of properties, each property comprising data reflecting a characteristic of an entity represented by the object, the method comprising the following operations performed by one or more processors: executing, for each object within the plurality of objects, the following: creating a slug for the object, the slug comprising a second plurality of properties from the object that includes at least some of the first plurality of properties; and inputting the slug for the object into a counting Bloom filter; identifying for each created slug whose corresponding bin within the counting Bloom filter has a count value equal to 1, the object associated with the slug as unique within the plurality of objects; inputting, using at least one processor, for each created slug, the slug and its corresponding object into a multimap, if a bin within the counting Bloom filter corresponding to the slug has a count value greater than 1, wherein the slug is a key to the multimap and the object is a value to the multimap; and identifying for each multimap key with one value, the object associated with the slug stored as the key as unique within the plurality of objects.
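The two-stage uniqueness test in the claim (counting Bloom filter first, multimap second) can be sketched roughly as below. This is a minimal Python illustration under stated assumptions, not the claimed implementation: the function names (`make_slug`, `bin_index`, `find_unique`), the single SHA-256-based hash function, the default bin count, and the use of a `defaultdict(list)` as the multimap are all hypothetical choices made for the example.

```python
import hashlib
from collections import defaultdict


def make_slug(obj, slug_props):
    """Build a slug from a chosen subset of the object's properties."""
    return tuple(obj.get(p) for p in slug_props)


def bin_index(slug, num_bins):
    """Map a slug to one counter using a single hash function (an assumption)."""
    digest = hashlib.sha256(repr(slug).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_bins


def find_unique(objects, slug_props, num_bins=1024):
    """Return the objects whose slug occurs exactly once among `objects`."""
    slugs = [make_slug(obj, slug_props) for obj in objects]

    # Stage 1: input every slug into the counting Bloom filter.
    counts = [0] * num_bins
    for slug in slugs:
        counts[bin_index(slug, num_bins)] += 1

    unique = []
    multimap = defaultdict(list)  # slug (key) -> objects (values)
    for obj, slug in zip(objects, slugs):
        if counts[bin_index(slug, num_bins)] == 1:
            # No other slug reached this bin, so the object is unique.
            unique.append(obj)
        else:
            # Count > 1 means a duplicate or a hash collision; defer to stage 2.
            multimap[slug].append(obj)

    # Stage 2: a multimap key with a single value was only a bin collision.
    for values in multimap.values():
        if len(values) == 1:
            unique.append(values[0])
    return unique
```

The counter check lets objects in uncontended bins be declared unique without storing them, so in this sketch the multimap only has to hold objects from bins that more than one slug reached.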
Address: Palo Alto CA US