Revisiting Normalized Gradient Descent: Fast Evasion of Saddle Points

Ryan Murray, Brian Swenson, Soummya Kar

Research output: Contribution to journalArticlepeer-review

20 Scopus citations

Abstract

The paper considers normalized gradient descent (NGD), a natural modification of classical gradient descent (GD) in optimization problems. It is shown that, contrary to GD, NGD escapes saddle points 'quickly.' A serious shortcoming of GD in nonconvex problems is that it can take arbitrarily long to escape from the neighborhood of a saddle point. In practice, this issue can significantly slow the convergence of GD, particularly in high-dimensional nonconvex problems. The paper focuses on continuous-time dynamics. It is shown that 1) NGD 'almost never' converges to saddle points and 2) the time required for NGD to escape from a ball of radius r about a saddle point x$ is at most 5κr, where κ is the condition number of the Hessian of f at x. As a simple application of these results, a global convergence-time bound is established for NGD under mild assumptions.

Original languageEnglish (US)
Article number8706530
Pages (from-to)4818-4824
Number of pages7
JournalIEEE Transactions on Automatic Control
Volume64
Issue number11
DOIs
StatePublished - Nov 2019

All Science Journal Classification (ASJC) codes

  • Control and Systems Engineering
  • Computer Science Applications
  • Electrical and Electronic Engineering

Cite this