NEURAL NETWORKS WITH TRAINABLE MATRIX ACTIVATION FUNCTIONS

Zhengqi Liu, Shuhao Cao, Yuwen Li, Ludmil Zikatanov

Research output: Contribution to journal › Article › peer-review

Abstract

The training process of neural networks usually optimizes the weights and bias parameters of linear transformations, while the nonlinear activation functions are prespecified and fixed. This work develops a systematic approach to constructing matrix-valued activation functions whose entries are generalized from the rectified linear unit (ReLU). The activation is based on matrix-vector multiplications that use only scalar multiplications and comparisons. The proposed activation functions depend on parameters that are trained along with the weights and bias vectors. Neural networks based on this approach are simple and efficient, and they are shown to be robust in numerical experiments.
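As a rough illustration of the idea in the abstract, the sketch below applies a diagonal matrix-valued activation D(x) to a vector x, where each diagonal entry is chosen by one comparison and applied with one scalar multiplication. The function name `diag_activation` and the parameters `slope_pos` and `slope_neg` are hypothetical, and the two-piece form is an assumption made for illustration; it is not taken from the paper and may differ from the authors' construction. With slopes 1 and 0 it reduces to ReLU.

```python
def diag_activation(x, slope_pos, slope_neg):
    """Return D(x) x, where D(x) is diagonal and its i-th entry is
    slope_pos[i] if x[i] > 0 and slope_neg[i] otherwise.

    Hypothetical two-piece form: in a trainable setting, slope_pos and
    slope_neg would be updated together with the weights and biases.
    """
    out = []
    for xi, a, b in zip(x, slope_pos, slope_neg):
        d_i = a if xi > 0 else b   # one comparison selects the diagonal entry
        out.append(d_i * xi)       # one scalar multiplication applies it
    return out


def relu(x):
    """Standard ReLU, used here only as a reference."""
    return [max(xi, 0.0) for xi in x]


x = [-2.0, 0.5, 3.0]

# Slopes (1, 0) recover ReLU.
relu_like = diag_activation(x, [1.0, 1.0, 1.0], [0.0, 0.0, 0.0])

# Arbitrary illustrative slopes standing in for trained values.
trained_like = diag_activation(x, [1.0, 0.9, 1.2], [0.1, 0.0, -0.05])
```

Because D(x) is diagonal here, the matrix-vector product never needs a full matrix: the cost per layer is one comparison and one multiplication per component, the same order as ReLU itself.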

Original language: English (US)
Pages (from-to): 1-11
Number of pages: 11
Journal: Journal of Machine Learning for Modeling and Computing
Volume: 6
Issue number: 2
State: Published - 2025

All Science Journal Classification (ASJC) codes

  • Computer Science (miscellaneous)
  • Artificial Intelligence
  • Computational Mechanics
  • Modeling and Simulation
