发明名称 JOINS AND AGGREGATIONS ON MASSIVE GRAPHS USING LARGE-SCALE GRAPH PROCESSING
摘要 This disclosure is directed to large-scale graph processing to determine second-degree connections for members of a social network. A social graph is duplicated into two graphs, where each of the two graphs are partitioned into various partitions. The partitions are each sorted according to a predetermined key selected from each of the graphs. The partitions are then assigned logical Work Units, where a first set of Work Units are determined from a first graph and second set of Work Units are determined from a second graph. The Work Units are determined to be asymmetrical such that the partitions of the first set of Work Units are assigned differently than the partitions of the second set of Work Units. One set of Work Units are loaded in-memory and another set of Work Units are streamed to a mapping module process, which determines the second-degree connections from the sets of Work Units.
申请公布号 US2016253389(A1) 申请公布日期 2016.09.01
申请号 US201615056996 申请日期 2016.02.29
申请人 Linkedln Corporation 发明人 Vemuri Srinivas S.;Xie Wenlei;Pyne Suvodeep;Gankidi Vinitha Reddy;Varshney Maneesh;Tiwari Mitul
分类号 G06F17/30 主分类号 G06F17/30
代理机构 代理人
主权项 1. A system comprising: a machine-readable medium storing computer-executable instructions; and at least one hardware processor communicatively coupled to the machine-readable medium that, when the computer-executable instructions are executed, configures the system to: retrieve a social graph for a social networking service, the social graph including a first plurality of nodes corresponding to members of the social networking service and a first plurality of edges connecting the first plurality of nodes;partition the first plurality of nodes into a first plurality of partitions, the first plurality of partitions being determined according to a source key associated with the first plurality of nodes;partition the first plurality of nodes into a second plurality of partitions, the second plurality of partitions being determined according to a destination key associated with the first plurality of nodes;assign each of the first plurality of partitions to a work unit of a first plurality of work units, wherein the first plurality of work units are determined according to a constraint on computing resources;assign each of the second plurality of partitions to a work unit of a second plurality of work units; anddetermine a second plurality of nodes from the social graph using at least one pair of work units constructed from a first work unit selected from the first plurality of work units and a second work unit selected from the second plurality of work units, the second plurality of nodes including at least one node representing a second-degree connection for a member of the social networking service.
地址 Mountain View CA US