On the convergence of formally diverging neural net-based classifiers

Research output: Contribution to journal › Article › peer-review


Abstract

We present an analytical study of gradient descent algorithms applied to a classification problem in machine learning based on artificial neural networks. Our approach relies on entropy–entropy dissipation estimates that yield explicit rates. Specifically, as long as the neural nets remain within a set of “good classifiers”, we establish a striking feature of the algorithm: it mathematically diverges as the number of gradient descent iterations (“time”) goes to infinity, but this divergence is only logarithmic, while the loss function vanishes polynomially. As a consequence, the algorithm still yields a classifier with good numerical performance and may even appear to converge.
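The behaviour described in the abstract can be illustrated on a much simpler stand-in problem. Below is a minimal sketch, assuming plain gradient descent on the logistic loss for a linear classifier over linearly separable toy data rather than the paper's actual neural-network setting and entropy-based analysis; all variable names, the step size, and the data generation are illustrative assumptions, not taken from the article. On such data the weight norm is expected to grow only logarithmically in the iteration count while the training loss decays polynomially, which mirrors the "formally diverging yet numerically well-behaved" phenomenon the abstract describes.

```python
import numpy as np

# Illustrative sketch (not the paper's setting): gradient descent on the average
# logistic loss for a separable toy problem. The weight norm grows roughly like
# log(t) while the loss decays polynomially in t.

rng = np.random.default_rng(0)

# Separable 2D data: labels in {-1, +1} given by the sign of the first coordinate.
X = rng.normal(size=(200, 2))
y = np.sign(X[:, 0])
y[y == 0] = 1.0

w = np.zeros(2)   # linear classifier, a minimal stand-in for a neural net
lr = 0.5          # step size of the gradient descent iterations ("time")

for t in range(1, 100_001):
    margins = y * (X @ w)
    # Gradient of (1/n) * sum_i log(1 + exp(-margin_i)) with respect to w.
    grad = -(X * (y / (1.0 + np.exp(margins)))[:, None]).mean(axis=0)
    w -= lr * grad
    if t in (10, 100, 1_000, 10_000, 100_000):
        loss = np.mean(np.log1p(np.exp(-y * (X @ w))))
        print(f"t={t:6d}  ||w||={np.linalg.norm(w):7.3f}  loss={loss:.2e}")
```

Running the sketch, the printed weight norm keeps increasing (the iterates never converge to a finite minimiser, since the infimum of the loss is only attained "at infinity"), yet the loss and the classification error become negligible after relatively few iterations, so the resulting classifier behaves as if the method had converged.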

Original language: English (US)
Pages (from-to): 395-405
Number of pages: 11
Journal: Comptes Rendus Mathematique
Volume: 356
Issue number: 4
DOIs
State: Published - Apr 2018

All Science Journal Classification (ASJC) codes

  • General Mathematics
