Global convergence of Langevin dynamics based algorithms for nonconvex optimization

Pan Xu, Difan Zou, Jinghui Chen, Quanquan Gu

Research output: Contribution to journal › Conference article › peer-review

96 Scopus citations

Abstract

We present a unified framework to analyze the global convergence of Langevin dynamics based algorithms for nonconvex finite-sum optimization with $n$ component functions. At the core of our analysis is a direct analysis of the ergodicity of the numerical approximations to Langevin dynamics, which leads to faster convergence rates. Specifically, we show that gradient Langevin dynamics (GLD) and stochastic gradient Langevin dynamics (SGLD) converge to the almost minimizer within $\tilde{O}(nd/(\lambda\epsilon))$ and $\tilde{O}(d^7/(\lambda^5\epsilon^5))$ stochastic gradient evaluations respectively, where $d$ is the problem dimension, and $\lambda$ is the spectral gap of the Markov chain generated by GLD. Both results improve upon the best known gradient complexity results [45]. Furthermore, for the first time we prove the global convergence guarantee for variance reduced stochastic gradient Langevin dynamics (SVRG-LD) to the almost minimizer within $\tilde{O}(\sqrt{n}\,d^5/(\lambda^4\epsilon^{5/2}))$ stochastic gradient evaluations, which outperforms the gradient complexities of GLD and SGLD in a wide regime. Our theoretical analyses shed some light on using Langevin dynamics based algorithms for nonconvex optimization with provable guarantees.
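The abstract names the algorithms without stating their update rules. As a rough illustration only, the sketch below assumes the commonly used Euler-Maruyama discretization of Langevin dynamics, x_{k+1} = x_k - eta * g_k + sqrt(2 * eta / beta) * xi_k with xi_k standard Gaussian, where g_k is the full gradient for GLD or a mini-batch estimate for SGLD. The names `sgld`, `toy_grad`, `eta` (step size), `beta` (inverse temperature) and the double-well toy objective are hypothetical choices made for this example and do not come from the paper; SVRG-LD is not shown.

```python
import math
import random


def sgld(grad_i, n, x0, eta, beta, steps, batch_size, rng):
    """Sketch of (stochastic) gradient Langevin dynamics on a finite sum.

    grad_i(i, x) returns the gradient of the i-th component function at x
    (a list of floats). Setting batch_size >= n uses the full gradient (GLD).
    """
    x = list(x0)
    noise_scale = math.sqrt(2.0 * eta / beta)  # assumed noise magnitude
    for _ in range(steps):
        if batch_size >= n:
            batch = list(range(n))
        else:
            batch = rng.sample(range(n), batch_size)
        # Average the component gradients over the (mini-)batch.
        g = [0.0] * len(x)
        for i in batch:
            g = [a + b for a, b in zip(g, grad_i(i, x))]
        g = [a / len(batch) for a in g]
        # Gradient step plus isotropic Gaussian noise.
        x = [xj - eta * gj + noise_scale * rng.gauss(0.0, 1.0)
             for xj, gj in zip(x, g)]
    return x


# Hypothetical nonconvex toy problem: f_i(x) = (x^2 - 1)^2 / 4 + c_i * x,
# with the c_i summing to zero so the average objective is a double well.
N_COMPONENTS = 10
C = [0.5 if i % 2 == 0 else -0.5 for i in range(N_COMPONENTS)]


def toy_grad(i, x):
    return [x[0] * (x[0] ** 2 - 1.0) + C[i]]
```

With a very large `beta` the noise term is negligible and the full-gradient variant behaves like plain gradient descent, settling near one of the two minimizers at x = 1 or x = -1; smaller `beta` values inject more noise, which is what allows the iterates to move between basins.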

Original language: English (US)
Pages (from-to): 3122-3133
Number of pages: 12
Journal: Advances in Neural Information Processing Systems
Volume: 2018-December
State: Published - 2018
Event: 32nd Conference on Neural Information Processing Systems, NeurIPS 2018 - Montreal, Canada
Duration: Dec 2 2018 → Dec 8 2018

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Information Systems
  • Signal Processing
