TY - GEN
T1 - Learning to quantize deep neural networks
T2 - 57th ACM/IEEE Design Automation Conference, DAC 2020
AU - Khan, Md Fahim Faysal
AU - Kamani, Mohammad Mahdi
AU - Mahdavi, Mehrdad
AU - Narayanan, Vijaykrishnan
N1 - Publisher Copyright: © 2020 IEEE.
PY - 2020/7
Y1 - 2020/7
AB - Because they reduce model size and computation cost for dedicated AI accelerator designs, neural network quantization methods have recently attracted considerable attention. Unfortunately, merely minimizing quantization loss using constant discretization causes accuracy deterioration. In this paper, we propose competitive-collaborative quantization (CCQ), an iterative accuracy-driven learning framework that gradually adapts the bit-precision of each individual layer. Orthogonal to prior quantization policies, which keep the first and last layers of the network at full precision, CCQ offers layer-wise competition for any target quantization policy, with holistic layer fine-tuning to recover accuracy, so that state-of-the-art networks can be entirely quantized without significant accuracy degradation.
UR - http://www.scopus.com/inward/record.url?scp=85084187227&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85084187227&partnerID=8YFLogxK
U2 - 10.1109/DAC18072.2020.9218576
DO - 10.1109/DAC18072.2020.9218576
M3 - Conference contribution
AN - SCOPUS:85084187227
T3 - Proceedings - Design Automation Conference
BT - 2020 57th ACM/IEEE Design Automation Conference, DAC 2020
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 20 July 2020 through 24 July 2020
ER -