TY - GEN
T1 - CNNs with Compact Activation Function
AU - Wang, Jindong
AU - Xu, Jinchao
AU - Zhu, Jianqing
N1 - Publisher Copyright: © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2022
Y1 - 2022
N2 - Activation functions play an important role in neural networks. We propose using the hat activation function, namely the first-order B-spline, as the activation function for CNNs, including MgNet and ResNet. Unlike commonly used activation functions such as ReLU, the hat function has compact support and no obvious spectral bias. Although spectral bias is thought to be beneficial for generalization, our classification experiments on the MNIST, CIFAR10/100, and ImageNet datasets show that MgNet and ResNet with the hat function still generalize slightly better than CNNs with ReLU. This indicates that CNNs without spectral bias can generalize well. We also show that, although the hat function has a small activation area, which makes it more likely to induce the vanishing gradient problem, hat CNNs still work well with various initialization methods.
UR - https://www.scopus.com/pages/publications/85134349788
U2 - 10.1007/978-3-031-08754-7_40
DO - 10.1007/978-3-031-08754-7_40
M3 - Conference contribution
AN - SCOPUS:85134349788
SN - 9783031087530
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 319
EP - 327
BT - Computational Science - ICCS 2022, 22nd International Conference, Proceedings
A2 - Groen, Derek
A2 - de Mulatier, Clélia
A2 - Krzhizhanovskaya, Valeria V.
A2 - Sloot, Peter M.A.
A2 - Paszynski, Maciej
A2 - Dongarra, Jack J.
PB - Springer Science and Business Media Deutschland GmbH
T2 - 22nd Annual International Conference on Computational Science, ICCS 2022
Y2 - 21 June 2022 through 23 June 2022
ER -