TY - GEN
T1 - Brief announcement
T2 - 24th ACM Symposium on Parallelism in Algorithms and Architectures, SPAA'12
AU - Chandramowlishwaran, Aparna
AU - Choi, Jee Whan
AU - Madduri, Kamesh
AU - Vuduc, Richard
PY - 2012
Y1 - 2012
N2 - This paper presents the first in-depth models for compute and memory costs of the kernel-independent Fast Multipole Method (KIFMM). The Fast Multiple Method (FMM) has asymptotically linear time complexity with a guaranteed approximation accuracy, making it an attractive candidate for a wide variety of particle system simulations on future exascale systems. This paper reports on three key advances. First, we present lower bounds on cache complexity for key phases of the FMM and use these bounds to derive analytical performance models. Secondly, using these models, we present results for choosing the optimal algorithmic tuning parameter. Lastly, we use these performance models to make predictions about FMM's scalability on possible exascale system configurations, based on current technology trends. Looking forward to exascale, we suggest that the FMM, though highly compute-bound on today's systems, could in fact become memory-bound by 2020. Copyright is held by the author/owner(s).
AB - This paper presents the first in-depth models for compute and memory costs of the kernel-independent Fast Multipole Method (KIFMM). The Fast Multiple Method (FMM) has asymptotically linear time complexity with a guaranteed approximation accuracy, making it an attractive candidate for a wide variety of particle system simulations on future exascale systems. This paper reports on three key advances. First, we present lower bounds on cache complexity for key phases of the FMM and use these bounds to derive analytical performance models. Secondly, using these models, we present results for choosing the optimal algorithmic tuning parameter. Lastly, we use these performance models to make predictions about FMM's scalability on possible exascale system configurations, based on current technology trends. Looking forward to exascale, we suggest that the FMM, though highly compute-bound on today's systems, could in fact become memory-bound by 2020. Copyright is held by the author/owner(s).
UR - http://www.scopus.com/inward/record.url?scp=84864128713&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84864128713&partnerID=8YFLogxK
U2 - 10.1145/2312005.2312039
DO - 10.1145/2312005.2312039
M3 - Conference contribution
AN - SCOPUS:84864128713
SN - 9781450312134
T3 - Annual ACM Symposium on Parallelism in Algorithms and Architectures
SP - 182
EP - 184
BT - SPAA'12 - Proceedings of the 24th ACM Symposium on Parallelism in Algorithms and Architectures
Y2 - 25 June 2012 through 27 June 2012
ER -