Phoenix: A Constraint-Aware Scheduler for Heterogeneous Datacenters

Prashanth Thinakaran, Jashwant Raj Gunasekaran, Bikash Sharma, Mahmut Taylan Kandemir, Chita R. Das

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

40 Scopus citations

Abstract

Today's datacenters are becoming increasingly diverse in both hardware and software architectures in order to support a myriad of applications. These applications are also heterogeneous in terms of job response times and resource requirements (e.g., number of cores, GPUs, network speed), which are expressed as task constraints. Constraints are used to ensure task performance guarantees/Quality of Service (QoS) by enabling the application to express its specific resource requirements. While several recently proposed schedulers aim to improve overall application and system performance, few of them consider resource constraints across tasks when making scheduling decisions. Furthermore, latency-critical workloads and short-lived jobs, which typically constitute about 90% of the total jobs in a datacenter, have strict QoS requirements, which can be ensured by minimizing the tail latency through effective scheduling. In this paper, we propose Phoenix, a constraint-aware hybrid scheduler that addresses both problems (constraint awareness and low tail latency) by minimizing job response times at constrained workers. We use a novel Constraint Resource Vector (CRV) based scheduling approach, which in turn facilitates reordering of the jobs in a queue to minimize tail latency. We have used the publicly available Google traces to analyze their constraint characteristics and have embedded these constraints in Cloudera and Yahoo cluster traces to study the impact of traces on system performance. Experiments with Google, Cloudera, and Yahoo cluster traces across a 15,000-worker-node cluster show that Phoenix improves the 99th percentile job response times by 1.9× on average across all three traces when compared against a state-of-the-art hybrid scheduler. Further, in comparison to other distributed schedulers such as Hawk, it improves the 90th and 99th percentile job response times by 4.5× and 5×, respectively.

Original language: English (US)
Title of host publication: Proceedings - IEEE 37th International Conference on Distributed Computing Systems, ICDCS 2017
Editors: Kisung Lee, Ling Liu
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 977-987
Number of pages: 11
ISBN (Electronic): 9781538617915
State: Published - Jul 13 2017
Event: 37th IEEE International Conference on Distributed Computing Systems, ICDCS 2017 - Atlanta, United States
Duration: Jun 5 2017 to Jun 8 2017

Publication series

Name: Proceedings - International Conference on Distributed Computing Systems

Other

Other: 37th IEEE International Conference on Distributed Computing Systems, ICDCS 2017
Country/Territory: United States
City: Atlanta
Period: 6/5/17 to 6/8/17

All Science Journal Classification (ASJC) codes

  • Software
  • Hardware and Architecture
  • Computer Networks and Communications
