TY - JOUR
T1 - Impact of virtual channels and adaptive routing on application performance
AU - Vaidya, Aniruddha S.
AU - Sivasubramaniam, Anand
AU - Das, Chita R.
N1 - Funding Information:
degree in computer science from the Indian Institute of Technology, Madras, in 1989, and the MS and PhD degrees in computer science from the Georgia Institute of Technology in 1991 and 1995, respectively. Since 1995, he has been with the Department of Computer Science and Engineering at the Pennsylvania State University, where he is currently an associate professor. His research interests are in the areas of computer architecture, operating systems, and performance evaluation, with a focus on parallel and distributed computing. He is the recipient of a US National Science Foundation Career Award, and is a member of the ACM, the IEEE, and the IEEE Computer Society.
Funding Information:
This research was supported in part by the US National Science Foundation (NSF) under grants MIP-9634197, CCR-9900701, an NSF Career Award MIP-9701475, and equipment grants from NSF and IBM. A preliminary version of this paper, titled “Performance Benefits of Virtual Channels and Adaptive Routing: An Application-Driven Study,” was presented at the 11th ACM International Conference on Supercomputing, Vienna, Austria, July 1997.
PY - 2001/2
Y1 - 2001/2
N2 - Research on multiprocessor interconnection networks has primarily focused on wormhole switching, virtual channel flow control, and routing algorithms to enhance their performance. The rationale behind this research is that by alleviating the network latency for high network loads, the overall system performance would improve. Many studies have used synthetic workloads to support this claim. However, such workloads may not necessarily capture the behavior of real applications. In this paper, we have used parallel applications for a closer examination of the network behavior. In particular, the performance benefit from enhancing a 2D mesh with virtual channels (VCs) and a fully adaptive routing algorithm is examined with a set of shared-memory and message-passing applications. Execution time and average message latency of shared-memory applications are measured using execution-driven simulation and by varying many architectural attributes that affect the network workload. The communication traces of message-passing applications, collected on an IBM-SP2, are used to run a trace-driven simulation of the mesh architecture to obtain message latency. Simulation results show that VCs and adaptive routing can reduce the network latency to varying degrees depending on the application. However, these modest benefits do not translate to significant improvements in the overall execution time because the load on the network is not high enough to exploit the advantages of the network enhancements. Moreover, this benefit may be negated if the architectural enhancements increase the network cycle time. Rather, emphasis should be placed on improving the raw network bandwidth and on providing faster network interfaces.
UR - http://www.scopus.com/inward/record.url?scp=0035248113&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0035248113&partnerID=8YFLogxK
U2 - 10.1109/71.910875
DO - 10.1109/71.910875
M3 - Article
AN - SCOPUS:0035248113
SN - 1045-9219
VL - 12
SP - 223
EP - 237
JO - IEEE Transactions on Parallel and Distributed Systems
JF - IEEE Transactions on Parallel and Distributed Systems
IS - 2
ER -