Abstract
The Gyrokinetic Toroidal Code (GTC) uses the particle-in-cell method to efficiently simulate plasma microturbulence. This work presents novel analysis and optimization techniques to enhance the performance of GTC on large-scale machines. We introduce cell access analysis to better manage locality vs. synchronization tradeoffs on CPU- and GPU-based architectures. Our optimized hybrid parallel implementation of GTC uses MPI, OpenMP, and NVIDIA CUDA, achieves up to a 2× speedup over the reference Fortran version on multiple parallel systems, and scales efficiently to tens of thousands of cores.
| Original language | English (US) |
|---|---|
| Pages (from-to) | 454-473 |
| Number of pages | 20 |
| Journal | International Journal of High Performance Computing Applications |
| Volume | 27 |
| Issue number | 4 |
| State | Published - Nov 2013 |
All Science Journal Classification (ASJC) codes
- Software
- Theoretical Computer Science
- Hardware and Architecture
Title: Analysis and optimization of gyrokinetic toroidal simulations on homogenous and heterogenous platforms