Heterogeneous MacroTasking (HEMT) for Parallel Processing in the Cloud

Yuquan Shan, George Kesidis, Aman Jain, Bhuvan Urgaonkar, Jalal Khamse-Ashari, Ioannis Lambadaris

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution

Abstract

Using tiny tasks (microtasks) has long been regarded as an effective way of load balancing in parallel computing systems. When combined with containerized execution nodes that pull in work upon becoming idle, microtasking has the desirable property of automatically adapting its load distribution to the processing capacities of participating nodes: more powerful nodes finish their work sooner and, therefore, pull in additional work faster. As a result, microtasking is deemed especially desirable in settings with heterogeneous processing capacities and poorly characterized workloads. However, microtasking incurs additional scheduling and I/O overheads that may make it costly in some scenarios. Moreover, the optimal task size generally needs to be learned. We herein study an alternative load balancing scheme, Heterogeneous MacroTasking (HEMT), wherein workload is intentionally skewed according to the nodes' processing capacity. We implemented and open-sourced a prototype of HEMT within the Apache Spark application framework and conducted experiments using the Apache Mesos cluster manager. It is shown experimentally that when workload-specific estimates of nodes' processing capacities are learned, Spark with HEMT offers up to 10% shorter average completion times for realistic, multistage data-processing workloads over the baseline Homogeneous microTasking (HomT) system.
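
The contrast between the two schemes described in the abstract can be illustrated with a toy model. The sketch below is not the authors' Spark prototype; it assumes that each node has a known constant speed, that every task pays a fixed overhead, and that work is perfectly divisible. All function and parameter names (`hemt_split`, `hemt_makespan`, `homt_makespan`, `per_task_overhead`) are hypothetical and chosen only for illustration.

```python
import heapq


def hemt_split(total_work, speeds):
    """Split total_work into one macrotask per node, proportional to speed.

    Assumption: `speeds` are the (learned) processing-capacity estimates.
    """
    total_speed = sum(speeds)
    return [total_work * s / total_speed for s in speeds]


def hemt_makespan(total_work, speeds, per_task_overhead):
    """Completion time when each node runs exactly one skewed macrotask."""
    shares = hemt_split(total_work, speeds)
    return max(per_task_overhead + w / s for w, s in zip(shares, speeds))


def homt_makespan(total_work, speeds, task_size, per_task_overhead):
    """Completion time for pull-based microtasking with equal-size tasks.

    Whichever node becomes idle first pulls the next task, so faster
    nodes end up processing more tasks without any capacity estimate.
    """
    # Min-heap of (time at which the node becomes idle, node index).
    idle = [(0.0, i) for i in range(len(speeds))]
    heapq.heapify(idle)
    remaining = total_work
    finish = 0.0
    while remaining > 1e-12:
        t, i = heapq.heappop(idle)
        work = min(task_size, remaining)
        remaining -= work
        done = t + per_task_overhead + work / speeds[i]
        finish = max(finish, done)
        heapq.heappush(idle, (done, i))
    return finish


if __name__ == "__main__":
    speeds = [1.0, 3.0]  # hypothetical two-node cluster, one node 3x faster
    print("HEMT shares:", hemt_split(100.0, speeds))
    print("HEMT makespan:", hemt_makespan(100.0, speeds, 0.5))
    print("HomT makespan:", homt_makespan(100.0, speeds, 1.0, 0.5))
```

In this model, HEMT pays the per-task overhead once per node but depends on accurate speed estimates, whereas HomT needs no estimates but pays the overhead on every microtask. The model ignores multistage dependencies and I/O effects, so it says nothing about the measured results reported in the paper.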

Original language: English (US)
Title of host publication: WOC 2020 - Proceedings of the 2020 6th International Workshop on Container Technologies and Container Clouds, Part of Middleware 2020
Publisher: Association for Computing Machinery, Inc
Pages: 7-12
Number of pages: 6
ISBN (Electronic): 9781450382090
State: Published - Dec 7 2020
Event: 6th International Workshop on Container Technologies and Container Clouds, WOC 2020 - Part of Middleware 2020 - Virtual, Online, Netherlands
Duration: Dec 7 2020 to Dec 11 2020

Publication series

Name: WOC 2020 - Proceedings of the 2020 6th International Workshop on Container Technologies and Container Clouds, Part of Middleware 2020

Conference

Conference: 6th International Workshop on Container Technologies and Container Clouds, WOC 2020 - Part of Middleware 2020
Country/Territory: Netherlands
City: Virtual, Online
Period: 12/7/20 to 12/11/20

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Computer Networks and Communications
  • Information Systems and Management
