TY - JOUR
T1 - Sharc
T2 - Managing CPU and Network Bandwidth in Shared Clusters
AU - Urgaonkar, Bhuvan
AU - Shenoy, Prashant
N1 - Funding Information:
The authors would like to thank the reviewers for their insightful comments. This research was supported in part by the US National Science Foundation grants CCR-9984030 and EIA-0080119.
PY - 2004/1
Y1 - 2004/1
AB - In this paper, we argue the need for effective resource management mechanisms for sharing resources in commodity clusters. To address this issue, we present the design of Sharc, a system that enables resource sharing among applications in such clusters. Sharc depends on single-node resource management mechanisms, such as reservations or shares, and extends the benefits of such mechanisms to clustered environments. We present techniques for managing two important resources, CPU and network interface bandwidth, on a cluster-wide basis. Our techniques allow Sharc to 1) support reservation of CPU and network interface bandwidth for distributed applications, 2) dynamically allocate resources based on past usage, and 3) provide performance isolation to applications. Our experimental evaluation has shown that Sharc can scale to 256-node clusters running 100,000 applications. These results demonstrate that Sharc can be an effective approach for sharing resources among competing applications in moderate-size clusters.
UR - http://www.scopus.com/inward/record.url?scp=0742303481&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0742303481&partnerID=8YFLogxK
U2 - 10.1109/TPDS.2004.1264781
DO - 10.1109/TPDS.2004.1264781
M3 - Article
AN - SCOPUS:0742303481
SN - 1045-9219
VL - 15
SP - 2
EP - 17
JO - IEEE Transactions on Parallel and Distributed Systems
JF - IEEE Transactions on Parallel and Distributed Systems
IS - 1
ER -