Skipper: Enabling efficient SNN training through activation-checkpointing and time-skipping

Sonali Singh, Anup Sarma, Sen Lu, Abhronil Sengupta, Mahmut T. Kandemir, Emre Neftci, Vijaykrishnan Narayanan, Chita R. Das

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

4 Scopus citations

Abstract

Spiking neural networks (SNNs) are a highly efficient signal-processing mechanism in biological systems and have inspired a plethora of research efforts aimed at translating their energy efficiency to computational platforms. Efficient training approaches are critical for the successful deployment of SNNs. Compared to mainstream artificial neural networks (ANNs), training SNNs is far more challenging due to complex neural dynamics that evolve with time and their discrete, binary computing paradigm. Back-propagation-through-time (BPTT) with surrogate gradients has recently emerged as an effective technique to train deep SNNs directly. SNN-BPTT, however, has a major drawback: its memory requirement is high and increases with the number of timesteps. SNNs generally result from the discretization of ordinary differential equations, so their sequence length must typically be longer than that of RNNs, compounding the time-dependence problem. It therefore becomes hard to train deep SNNs on a single- or multi-GPU setup with sufficiently large batch sizes or timesteps, and extended periods of training are required to achieve reasonable network performance. In this work, we reduce the memory requirements of BPTT in SNNs to enable the training of deeper SNNs with more timesteps (T). For this, we leverage the notion of activation re-computation in the context of SNN training, which enables the GPU memory to scale sub-linearly with increasing timesteps. We observe that naively deploying the re-computation-based approach leads to a considerable computational overhead. To solve this, we propose a time-skipped BPTT approximation technique for SNNs, called Skipper, that not only alleviates this computation overhead but also lowers memory consumption further with little to no loss of accuracy. We show the efficacy of our proposed technique by comparing it against a popular method for memory footprint reduction during training.
Our evaluations on 5 state-of-the-art networks and 4 datasets show that, for a constant batch size and number of timesteps, Skipper reduces memory usage by 3.3× to 8.4× (6.7× on average) over baseline SNN-BPTT. It also achieves a speedup of 29% to 70% over the checkpointed approach and of 4% to 40% over the baseline approach. For a constant memory budget, Skipper can scale to an order of magnitude more timesteps than baseline SNN-BPTT.
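
The activation re-computation idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and does not show the time-skipping step; the single leaky integrate-and-fire neuron, the function names, and the `leak`, `threshold`, and `segment` parameters are all assumptions chosen for illustration. The sketch stores the membrane state only at segment boundaries during the forward pass and rebuilds the states inside a segment when a backward pass would need them.

```python
import math

def lif_step(v, x, leak=0.9, threshold=1.0):
    """One discrete LIF update (assumed model): leak, integrate, spike, subtractive reset."""
    v = leak * v + x
    spike = 1.0 if v >= threshold else 0.0
    return v - spike * threshold, spike

def forward_full(inputs, v0=0.0):
    """Baseline: keep the membrane state of every timestep, so memory grows with T."""
    states, v = [v0], v0
    for x in inputs:
        v, _ = lif_step(v, x)
        states.append(v)
    return states

def forward_checkpointed(inputs, segment, v0=0.0):
    """Keep only the state at the start of each segment of `segment` timesteps."""
    checkpoints, v = {}, v0
    for t, x in enumerate(inputs):
        if t % segment == 0:
            checkpoints[t] = v
        v, _ = lif_step(v, x)
    return checkpoints, v

def recompute_segment(inputs, checkpoints, start, segment):
    """Rebuild the states of one segment from its checkpoint, as a backward pass would."""
    v = checkpoints[start]
    states = [v]
    for x in inputs[start:start + segment]:
        v, _ = lif_step(v, x)
        states.append(v)
    return states

def peak_states(T, segment):
    """States held at once: every checkpoint plus one recomputed segment."""
    return math.ceil(T / segment) + segment
```

Under these assumptions, `peak_states(100, 10)` gives 20 stored states against 100 for the baseline, and the count is smallest when `segment` is close to the square root of T. The price is one extra forward pass per segment, which is the computational overhead the abstract refers to.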

Original language: English (US)
Title of host publication: Proceedings - 2022 55th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2022
Publisher: IEEE Computer Society
Pages: 565-581
Number of pages: 17
ISBN (Electronic): 9781665462723
State: Published - 2022
Event: 55th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2022 - Chicago, United States
Duration: Oct 1 2022 to Oct 5 2022

Publication series

Name: Proceedings of the Annual International Symposium on Microarchitecture, MICRO
Volume: 2022-October
ISSN (Print): 1072-4451

Conference

Conference: 55th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2022
Country/Territory: United States
City: Chicago
Period: 10/1/22 to 10/5/22

All Science Journal Classification (ASJC) codes

  • Hardware and Architecture
