TY - JOUR
T1 - Modern gyrokinetic particle-in-cell simulation of fusion plasmas on top supercomputers
AU - Wang, Bei
AU - Ethier, Stephane
AU - Tang, William
AU - Ibrahim, Khaled Z.
AU - Madduri, Kamesh
AU - Williams, Samuel
AU - Oliker, Leonid
N1 - Publisher Copyright: © The Author(s) 2017.
PY - 2019/1/1
Y1 - 2019/1/1
AB - The gyrokinetic toroidal code at Princeton (GTC-P) is a highly scalable and portable particle-in-cell (PIC) code. It solves the 5-D Vlasov–Poisson equation featuring efficient utilization of modern parallel computer architectures at the petascale and beyond. Motivated by the goal of developing a modern code capable of dealing with the physics challenge of increasing problem size with sufficient resolution, new thread-level optimizations have been introduced as well as a key additional domain decomposition. GTC-P’s multiple levels of parallelism, including internode 2-D domain decomposition and particle decomposition, as well as intranode shared memory partition and vectorization, have enabled pushing the scalability of the PIC method to extreme computational scales. In this article, we describe the methods developed to build a highly parallelized PIC code across a broad range of supercomputer designs. This particularly includes implementations on heterogeneous systems using NVIDIA GPU accelerators and Intel Xeon Phi (MIC) coprocessors and performance comparisons with state-of-the-art homogeneous HPC systems such as Blue Gene/Q. New discovery science capabilities in the magnetic fusion energy application domain are enabled, including investigations of ion–temperature–gradient driven turbulence simulations with unprecedented spatial resolution and long temporal duration. Performance studies with realistic fusion experimental parameters are carried out on multiple supercomputing systems spanning a wide range of cache capacities, cache-sharing configurations, memory bandwidth, interconnects, and network topologies. These performance comparisons using a realistic discovery-science-capable domain application code provide valuable insights on optimization techniques across one of the broadest sets of current high-end computing platforms worldwide.
UR - http://www.scopus.com/inward/record.url?scp=85041549508&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85041549508&partnerID=8YFLogxK
U2 - 10.1177/1094342017712059
DO - 10.1177/1094342017712059
M3 - Article
AN - SCOPUS:85041549508
SN - 1094-3420
VL - 33
SP - 169
EP - 188
JO - International Journal of High Performance Computing Applications
JF - International Journal of High Performance Computing Applications
IS - 1
ER -