TY - JOUR
T1 - Sensitivity-Based Error Resilient Techniques with Heterogeneous Multiply-Accumulate Unit for Voltage Scalable Deep Neural Network Accelerators
AU - Shin, Dongyeob
AU - Choi, Wonseok
AU - Park, Jongsun
AU - Ghosh, Swaroop
N1 - Funding Information:
Manuscript received April 29, 2019; revised July 4, 2019; accepted July 27, 2019. Date of publication August 8, 2019; date of current version September 17, 2019. This work was supported in part by the Ministry of Science and ICT (MSIT), South Korea, through the Information Technology Research Center (ITRC) Support Program supervised by the Institute for Information & Communications Technology Promotion (IITP) under Grant IITP-2019-2018-0-01433, in part by the Industrial Strategic Technology Development Program (Development of SoC Technology Based on Spiking Neural Cell for Smart Mobile and IoT Devices) under Grant 10077445, and in part by the Information Technology Research and Development Program of the Korea Evaluation Institute of Industrial Technology (Design Technology Development of Ultralow Voltage Operating Circuit and IP for Smart Sensor SoC) under Grant 10052716. This article was recommended by Guest Editor M. Ziegler. (Corresponding author: Jongsun Park.) D. Shin, W. Choi, and J. Park are with the School of Electrical Engineering, Korea University, Seoul 136-701, South Korea.
Publisher Copyright:
© 2011 IEEE.
PY - 2019/9
Y1 - 2019/9
N2 - With inherent algorithmic error resilience of deep neural networks (DNNs), supply voltage scaling could be a promising technique for energy-efficient DNN accelerator design. In this paper, we present an error-resilient technique to enable aggressive voltage scaling by exploiting the asymmetric error resilience (sensitivity) with respect to DNN layers, filters, and channels. First-order Taylor expansion is used to evaluate the filter/channel-level weight sensitivities of large-scale DNNs, which accurately approximates the weight sensitivities obtained from actual error injection simulations. We also present a heterogeneous multiply-accumulate (MAC) unit based design approach in which some of the MAC units are designed larger, with shorter critical path delays, for robustness to aggressive voltage scaling, while other MAC units are designed relatively smaller. The sensitivity variations among filter weights can be leveraged to design a DNN accelerator such that the computations with more sensitive weights are assigned to more robust (larger) MAC units while the computations with less sensitive weights are assigned to less robust (smaller) MAC units. Using dynamic programming, the sizes of the MAC units are selected to achieve the best DNN accuracy under an iso-area constraint. As a result, the proposed voltage-scalable DNN accelerator can achieve 34% energy savings in post-layout simulations using a 65 nm CMOS process with the ImageNet dataset using ResNet-18, compared to a state-of-the-art timing error recovery technique.
AB - With inherent algorithmic error resilience of deep neural networks (DNNs), supply voltage scaling could be a promising technique for energy-efficient DNN accelerator design. In this paper, we present an error-resilient technique to enable aggressive voltage scaling by exploiting the asymmetric error resilience (sensitivity) with respect to DNN layers, filters, and channels. First-order Taylor expansion is used to evaluate the filter/channel-level weight sensitivities of large-scale DNNs, which accurately approximates the weight sensitivities obtained from actual error injection simulations. We also present a heterogeneous multiply-accumulate (MAC) unit based design approach in which some of the MAC units are designed larger, with shorter critical path delays, for robustness to aggressive voltage scaling, while other MAC units are designed relatively smaller. The sensitivity variations among filter weights can be leveraged to design a DNN accelerator such that the computations with more sensitive weights are assigned to more robust (larger) MAC units while the computations with less sensitive weights are assigned to less robust (smaller) MAC units. Using dynamic programming, the sizes of the MAC units are selected to achieve the best DNN accuracy under an iso-area constraint. As a result, the proposed voltage-scalable DNN accelerator can achieve 34% energy savings in post-layout simulations using a 65 nm CMOS process with the ImageNet dataset using ResNet-18, compared to a state-of-the-art timing error recovery technique.
UR - http://www.scopus.com/inward/record.url?scp=85070722599&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85070722599&partnerID=8YFLogxK
U2 - 10.1109/JETCAS.2019.2933862
DO - 10.1109/JETCAS.2019.2933862
M3 - Article
AN - SCOPUS:85070722599
SN - 2156-3357
VL - 9
SP - 520
EP - 531
JO - IEEE Journal on Emerging and Selected Topics in Circuits and Systems
JF - IEEE Journal on Emerging and Selected Topics in Circuits and Systems
IS - 3
M1 - 8792195
ER -