Controlled Kernel Launch for Dynamic Parallelism in GPUs

Xulong Tang, Ashutosh Pattnaik, Huaipan Jiang, Onur Kayiran, Adwait Jog, Sreepathi Pai, Mohamed Ibrahim, Mahmut T. Kandemir, Chita R. Das

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

38 Scopus citations

Abstract

Dynamic parallelism (DP) is a promising feature for GPUs that allows on-demand spawning of kernels on the GPU without any CPU intervention. However, this feature has two major drawbacks. First, launching GPU kernels can incur significant performance penalties. Second, dynamically generated kernels are not always able to utilize the GPU cores efficiently because of hardware limits. To address these two concerns cohesively, we propose SPAWN, a runtime framework that controls the dynamically generated kernels, thereby directly reducing the associated launch overheads and queuing latency. Moreover, it allows a better mix of dynamically generated and original (parent) kernels for the scheduler to effectively hide the remaining overheads and improve the utilization of GPU resources. Our results show that, across 13 benchmarks, SPAWN achieves speedups of 69% and 57% over the flat (non-DP) implementation and the baseline DP, respectively.
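
The trade-off described in the abstract, between paying launch and queuing costs for a child kernel and doing the work inline in the parent thread, can be illustrated with a small host-side sketch. This is not SPAWN's actual policy, which the abstract does not specify. The cost model, the `LaunchCosts` fields, and the `should_launch_child` function are hypothetical names and assumptions introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class LaunchCosts:
    """Hypothetical cost model; every field is an illustrative assumption."""
    launch_overhead: float     # fixed cost of launching one child kernel
    queue_delay: float         # extra wait per child kernel already pending
    serial_item_cost: float    # cost for the parent thread to process one item
    parallel_item_cost: float  # effective per-item cost inside a child kernel


def should_launch_child(work_items: int, pending_children: int,
                        costs: LaunchCosts) -> bool:
    """Return True if spawning a child kernel is estimated to finish sooner
    than processing the same work inline in the parent thread."""
    # Estimated time if the parent thread loops over the work itself.
    inline_time = work_items * costs.serial_item_cost
    # Estimated time if a child kernel is launched: fixed launch overhead,
    # queuing behind already-pending children, then the parallel work.
    child_time = (costs.launch_overhead
                  + pending_children * costs.queue_delay
                  + work_items * costs.parallel_item_cost)
    return child_time < inline_time
```

Under this toy model, a small workload stays in the parent thread because the fixed launch overhead dominates, and a large workload is also kept inline when many child kernels are already queued. A real runtime would have to measure or estimate these quantities on the hardware rather than take them as constants.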

Original language: English (US)
Title of host publication: Proceedings - 2017 IEEE 23rd Symposium on High Performance Computer Architecture, HPCA 2017
Publisher: IEEE Computer Society
Pages: 649-660
Number of pages: 12
ISBN (Electronic): 9781509049851
State: Published - May 5 2017
Event: 23rd IEEE Symposium on High Performance Computer Architecture, HPCA 2017 - Austin, United States
Duration: Feb 4 2017 to Feb 8 2017

Publication series

Name: Proceedings - International Symposium on High-Performance Computer Architecture
ISSN (Print): 1530-0897

Other

Other: 23rd IEEE Symposium on High Performance Computer Architecture, HPCA 2017
Country/Territory: United States
City: Austin
Period: 2/4/17 to 2/8/17

All Science Journal Classification (ASJC) codes

  • Hardware and Architecture
