TY - JOUR
T1 - An application-driven study of parallel system overheads and network bandwidth requirements
AU - Sivasubramaniam, Anand
AU - Singla, Aman
AU - Ramachandran, Umakishore
AU - Venkateswaran, H.
N1 - Funding Information:
This work has been funded in part by U.S. National Science Foundation Grants MIPS-9058430, MIPS-9200005, MIPS-9630145, and an equipment grant from DEC. Anand Sivasubramaniam was supported in part by a U.S. National Science Foundation Career Award MIPS-9701475.
PY - 1999
Y1 - 1999
N2 - Evaluating and analyzing the performance of a parallel application on an architecture to explain the disparity between projected and delivered performance is an important aspect of parallel systems research. However, conducting such a study is hard due to the vast design space of these systems. In this paper, we study two important aspects related to the performance of parallel applications on shared memory parallel architectures. First, we quantify overheads observed during the execution of these applications on three different simulated architectures. We next use these results to synthesize the bandwidth requirements for the applications with respect to different network topologies. This study is performed using an execution-driven simulation tool called SPASM, which provides a way of isolating and quantifying the different parallel system overheads in a nonintrusive manner. The first exercise shows that in shared memory machines with private caches, as long as the applications are well-structured to exploit locality, the key determinant that impacts performance is network contention. The second exercise quantifies the network bandwidth needed to minimize the effect of network contention. Specifically, it is shown that for the applications considered, as long as the problem sizes are increased commensurate with the system size, current network technologies supporting 200-300 MBytes/sec link bandwidth are sufficient to keep the network overheads (such as latency and contention) within acceptable bounds.
AB - Evaluating and analyzing the performance of a parallel application on an architecture to explain the disparity between projected and delivered performance is an important aspect of parallel systems research. However, conducting such a study is hard due to the vast design space of these systems. In this paper, we study two important aspects related to the performance of parallel applications on shared memory parallel architectures. First, we quantify overheads observed during the execution of these applications on three different simulated architectures. We next use these results to synthesize the bandwidth requirements for the applications with respect to different network topologies. This study is performed using an execution-driven simulation tool called SPASM, which provides a way of isolating and quantifying the different parallel system overheads in a nonintrusive manner. The first exercise shows that in shared memory machines with private caches, as long as the applications are well-structured to exploit locality, the key determinant that impacts performance is network contention. The second exercise quantifies the network bandwidth needed to minimize the effect of network contention. Specifically, it is shown that for the applications considered, as long as the problem sizes are increased commensurate with the system size, current network technologies supporting 200-300 MBytes/sec link bandwidth are sufficient to keep the network overheads (such as latency and contention) within acceptable bounds.
UR - http://www.scopus.com/inward/record.url?scp=0032656539&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0032656539&partnerID=8YFLogxK
U2 - 10.1109/71.755819
DO - 10.1109/71.755819
M3 - Article
AN - SCOPUS:0032656539
SN - 1045-9219
VL - 10
SP - 193
EP - 210
JO - IEEE Transactions on Parallel and Distributed Systems
JF - IEEE Transactions on Parallel and Distributed Systems
IS - 3
ER -