发明名称 Fast personalized page rank on map reduce
摘要 A personalized page rank computation system is described herein that provides a fast MapReduce method for Monte Carlo approximation of personalized PageRank vectors of all the nodes in a graph. The method presented is both faster and less computationally intensive than existing methods, allowing a broader scope of problems to be solved by existing computing hardware. The system adopts the Monte Carlo approach and provides a method to compute single random walks of a given length for all nodes in a graph that it is superior in terms of the number of map-reduce iterations among a broad class of methods. The resulting solution reduces the I/O cost and outperforms the state-of-the-art FPPR approximation methods, in terms of efficiency and approximation error. Thus, the system can very efficiently perform single random walks of a given length starting at each node in the graph and can very efficiently approximate all the personalized PageRank vectors.
申请公布号 US8856047(B2) 申请公布日期 2014.10.07
申请号 US201113164788 申请日期 2011.06.21
申请人 Microsoft Corporation 发明人 Chakrabarti Kaushik;Xin Dong;Bahmani Bahman
分类号 G06F17/10;G06F17/16;G06F17/17;G06F17/30 主分类号 G06F17/10
代理机构 代理人 Choi Dan;Swain Sandy;Minhas Micky
主权项 1. A computer-implemented method to generate random walks from source nodes in a graph, the method comprising: receiving an input graph; receiving an input walk length that specifies a number of nodes to be visited from a source node to produce a random walk from each source node in an input graph; receiving an input segment length that determines how long each segment generated by the method will be; determining a number of segments of the received input segment length to be generated out of each node of the input graph based on the received input walk length; generating the determined number of segments of the received segment length; and merging the generated segments iteratively, merging multiple segments per node in each iteration, until a random walk of the received input walk length is obtained from each node, wherein at least a portion of the preceding steps are performed by at least one processor.
地址 Redmond WA US