Alleviating Bottlenecks for DNN Execution on GPUs via Opportunistic Computing

Xianwei Cheng, Hui Zhao, Mahmut Kandemir, Saraju Mohanty, Beilei Jiang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Scopus citations

Abstract

Edge computing and IoT applications are severely constrained by limited hardware resources. This makes memory-consuming DNN (Deep Neural Network) frameworks inapplicable to edge computing. Simple algorithms such as direct convolution are finding their way into embedded machine learning. As one of the most widely used platforms for DNN acceleration, GPUs face the bottleneck of on-chip bandwidth. This work introduces a GPU DNN execution architecture that relieves the on-chip bandwidth bottleneck by reducing data movement through opportunistic computing. We first investigate data access patterns from the hardware's point of view. We then propose two opportunistic computing techniques that predictably perform computation when data is available, with the help of assistant warps. By moving computation to the data, our techniques significantly reduce data movement and relieve the DNN execution bottleneck. Our evaluation results show that the proposed technique can improve DNN application performance by as much as 55%.

Original language: English (US)
Title of host publication: Proceedings of the 21st International Symposium on Quality Electronic Design, ISQED 2020
Publisher: IEEE Computer Society
Pages: 261-267
Number of pages: 7
ISBN (Electronic): 9781728142074
State: Published - Mar 2020
Event: 21st International Symposium on Quality Electronic Design, ISQED 2020 - Santa Clara, United States
Duration: Mar 25 2020 to Mar 26 2020

Publication series

Name: Proceedings - International Symposium on Quality Electronic Design, ISQED
Volume: 2020-March
ISSN (Print): 1948-3287
ISSN (Electronic): 1948-3295

Conference

Conference: 21st International Symposium on Quality Electronic Design, ISQED 2020
Country/Territory: United States
City: Santa Clara
Period: 3/25/20 to 3/26/20

All Science Journal Classification (ASJC) codes

  • Hardware and Architecture
  • Electrical and Electronic Engineering
  • Safety, Risk, Reliability and Quality
