发明名称 |
FAST AND SCALABLE CONNECTED COMPONENT COMPUTATION |
摘要 |
Finding connected components in a graph is a well-known problem in a wide variety of application areas such as social network analysis, data mining, image processing, and etc. We present an efficient and scalable approach to find all the connected components in a given graph. We compare our approach with the state-of-the-art on a real-world graph. We also demonstrate the viability of our approach on a massive graph with ˜6B nodes and ˜92B edges on an 80-node Hadoop cluster. To the best of our knowledge, this is the largest graph publicly used in such an experiment. |
申请公布号 |
US2015269230(A1) |
申请公布日期 |
2015.09.24 |
申请号 |
US201514663141 |
申请日期 |
2015.03.19 |
申请人 |
Intelius Inc. |
发明人 |
KARDES Hakan;AGRAWAL Siddharth;WANG Xin;SUN Ang |
分类号 |
G06F17/30 |
主分类号 |
G06F17/30 |
代理机构 |
|
代理人 |
|
主权项 |
1. A system for finding connected components in a graph comprising:
an input device that receives a list of edges in the graph; a distributed processing arrangement coupled to the input device, the distributed processing arrangement including a plurality of processors that execute, in a distributed fashion, an iterative process that generates adjacency for each known node in the graph; the distributed processing arrangement further deduplicates:
wherein the output of the deduplicating comprises a mapping from each node in the graph to a corresponding component identifier. |
地址 |
Bellevue WA US |