TY - GEN
T1 - PuLP
T2 - 2nd IEEE International Conference on Big Data, IEEE Big Data 2014
AU - Slota, George M.
AU - Madduri, Kamesh
AU - Rajamanickam, Sivasankaran
N1 - Publisher Copyright:
© 2014 IEEE.
PY - 2014
Y1 - 2014
N2 - We present PuLP, a parallel and memory-efficient graph partitioning method specifically designed to partition low-diameter networks with skewed degree distributions. Graph partitioning is an important Big Data problem because it impacts the execution time and energy efficiency of graph analytics on distributed-memory platforms. Partitioning determines the in-memory layout of a graph, which affects locality, intertask load balance, communication time, and overall memory utilization of graph analytics. A novel feature of our method PuLP (Partitioning using Label Propagation) is that it optimizes for multiple objective metrics simultaneously, while satisfying multiple partitioning constraints. Using our method, we are able to partition a web crawl with billions of edges on a single compute server in under a minute. For a collection of test graphs, we show that PuLP uses 8-39× less memory than state-of-the-art partitioners and is up to 14.5× faster, on average, than alternate approaches (with 16-way parallelism). We also achieve better partitioning quality results for the multi-objective scenario.
UR - http://www.scopus.com/inward/record.url?scp=84921727521&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84921727521&partnerID=8YFLogxK
DO - 10.1109/BigData.2014.7004265
M3 - Conference contribution
AN - SCOPUS:84921727521
T3 - Proceedings - 2014 IEEE International Conference on Big Data, IEEE Big Data 2014
SP - 481
EP - 490
BT - Proceedings - 2014 IEEE International Conference on Big Data, IEEE Big Data 2014
A2 - Lin, Jimmy
A2 - Pei, Jian
A2 - Hu, Xiaohua Tony
A2 - Chang, Wo
A2 - Nambiar, Raghunath
A2 - Aggarwal, Charu
A2 - Cercone, Nick
A2 - Honavar, Vasant
A2 - Huan, Jun
A2 - Mobasher, Bamshad
A2 - Pyne, Saumyadipta
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 27 October 2014 through 30 October 2014
ER -