TY - GEN
T1 - ResiRCA
T2 - 26th IEEE International Symposium on High Performance Computer Architecture, HPCA 2020
AU - Qiu, Keni
AU - Jao, Nicholas
AU - Zhao, Mengying
AU - Mishra, Cyan Subhra
AU - Gudukbay, Gulsum
AU - Jose, Sethu
AU - Sampson, Jack
AU - Kandemir, Mahmut Taylan
AU - Narayanan, Vijaykrishnan
N1 - Funding Information:
This work was supported in part by Semiconductor Research Corporation (SRC), Center for Brain-inspired Computing (C-BRIC), Center for Research in Intelligent Storage and Processing in Memory (CRISP), NSF Grants #1822923 (SPX: SOPHIA), #1763681, #1629915, #1629129, #1317560, #1526750, National Natural Science Foundation of China (NSFC #61872251) and Beijing Advanced Innovation Center for Imaging Technology.
Publisher Copyright:
© 2020 IEEE.
PY - 2020/2
Y1 - 2020/2
N2 - Many recent works have shown substantial efficiency boosts from performing inference tasks on Internet of Things (IoT) nodes rather than merely transmitting raw sensor data. However, such tasks, e.g., convolutional neural networks (CNNs), are very compute intensive. They are therefore challenging to complete at sensing-matched latencies in ultra-low-power and energy-harvesting IoT nodes. ReRAM crossbar-based accelerators (RCAs) are an ideal candidate to perform the dominant multiplication-and-accumulation (MAC) operations in CNNs efficiently, but conventional, performance-oriented RCAs, while energy-efficient, are power hungry and ill-optimized for the intermittent and unstable power supply of energy-harvesting IoT nodes. This paper presents the ResiRCA architecture that integrates a new, lightweight, and configurable RCA suitable for energy harvesting environments as an opportunistically executing augmentation to a baseline sense-and-transmit battery-powered IoT node. To maximize ResiRCA throughput under different power levels, we develop the ResiSchedule approach for dynamic RCA reconfiguration. The proposed approach uses loop tiling-based computation decomposition, model duplication within the RCA, and inter-layer pipelining to reduce RCA activation thresholds and more closely track execution costs with dynamic power income. Experimental results show that ResiRCA together with ResiSchedule achieve average speedups and energy efficiency improvements of 8x and 14x respectively compared to a baseline RCA with intermittency-unaware scheduling.
UR - http://www.scopus.com/inward/record.url?scp=85084187455&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85084187455&partnerID=8YFLogxK
U2 - 10.1109/HPCA47549.2020.00034
DO - 10.1109/HPCA47549.2020.00034
M3 - Conference contribution
AN - SCOPUS:85084187455
T3 - Proceedings - 2020 IEEE International Symposium on High Performance Computer Architecture, HPCA 2020
SP - 315
EP - 327
BT - Proceedings - 2020 IEEE International Symposium on High Performance Computer Architecture, HPCA 2020
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 22 February 2020 through 26 February 2020
ER -