摘要 |
Embodiments are described for a method for processing graph data by executing a Markov Clustering algorithm (MCL) to find clusters of vertices of the graph data, organizing the graph data by column by calculating a probability percentage for each column of a similarity matrix of the graph data to produce column data, generating a probability matrix of states of the column data, performing an expansion of the probability matrix by computing a power of the matrix using a Map-Reduce model executed in a processor-based computing device; and organizing the probability matrix into a set of sub-matrices to find the least amount of data needed for the Map-Reduce model given that two lines of data in the matrix are required to compute a single value for the power of the matrix. One of at least two strategies may be used to computing the power of the matrix (matrix square, M2) based on simplicity of execution or improved memory usage.
|