CASH: A credit aware scheduling for public cloud platforms

Aakash Sharma, Saravanan Dhakshinamurthyy, George Kesidis, Chita R. Das

Research output: Chapter in Book/Report/Conference proceedingConference contribution

4 Scopus citations

Abstract

Distributed data processing frameworks such as Hadoop, Tez, Spark, and Flink are exclusively used by public cloud tenants for executing large scale data analytics applications in various domains including but not limited to content management, financial sector, healthcare etc. These frameworks slice a job into a number of smaller tasks, which are then executed by a job scheduler on a multi-node compute cluster. While making scheduling decisions, the State-of-art schedulers employed in these frameworks assume hardware resources such as CPU, disk I/O and network I/O to offer a fixed service rate. However, in a public cloud environment, many of these resources are associated with burstable service rates. More specifically, the resources offer a guaranteed baseline service rate with an option to burst above their baseline rate by expending accumulated burst credits. Being unaware about this underlying hardware burstability, schedulers tend to make sub-optimal task placement decisions, thereby adversely affecting the job completion times, leading to higher deployment costs.In this paper, we propose CASH, a burst credit aware scheduler, which is cognizant about the burst credits associated with the individual hardware resources in the public cloud cluster. Through coarse grained task annotations depicting the burst credit demand of individual tasks and dynamically monitoring the credits for the underlying resources, CASH performs optimal task placement decisions. We prototype CASH on YARN, Hadoop, and Tez, and extensively evaluate it using both batch and streaming workloads. Our experimental results with CASH show CPU-credit based instances, like AWS T3, are a viable cost effective alternative when compared to self-managed offerings like Amazon EMR, for running large scale batch workloads. Furthermore, we demonstrate that CASH can accelerate streaming SQL queries on a large Hive database by up to 39.4% , leading to public cloud cost savings by up to 22%.

Original languageEnglish (US)
Title of host publicationProceedings - 21st IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGrid 2021
EditorsLaurent Lefevre, Stacy Patterson, Young Choon Lee, Haiying Shen, Shashikant Ilager, Mohammad Goudarzi, Adel N. Toosi, Rajkumar Buyya
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages227-236
Number of pages10
ISBN (Electronic)9781728195865
DOIs
StatePublished - May 2021
Event21st IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGrid 2021 - Virtual, Melbourne, Australia
Duration: May 10 2021May 13 2021

Publication series

NameProceedings - 21st IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGrid 2021

Conference

Conference21st IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGrid 2021
Country/TerritoryAustralia
CityVirtual, Melbourne
Period5/10/215/13/21

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Hardware and Architecture
  • Information Systems
  • Information Systems and Management

Fingerprint

Dive into the research topics of 'CASH: A credit aware scheduling for public cloud platforms'. Together they form a unique fingerprint.

Cite this