Asynchronous variance-reduced block schemes for composite non-convex stochastic optimization: block-specific steplengths and adapted batch-sizes

Jinlong Lei, Uday V. Shanbhag

Research output: Contribution to journalArticlepeer-review

3 Scopus citations

Abstract

This work considers the minimization of a sum of an expectation-valued coordinate-wise smooth nonconvex function and a nonsmooth block-separable convex regularizer. We propose an asynchronous variance-reduced algorithm, where in each iteration, a single block is randomly chosen to update its estimates by a proximal variable sample-size stochastic gradient scheme, while the remaining blocks are kept invariant. Notably, each block employs a steplength relying on its block-specific Lipschitz constant while batch-sizes are updated as a function of the number of times that block is selected. We show that every limit point is a stationary point and establish the ergodic non-asymptotic rate (Formula presented.). Iteration and oracle complexity to obtain an ε-stationary point are shown to be (Formula presented.) and (Formula presented.), respectively. Furthermore, under a proximal Polyak–Łojasiewicz condition with batch sizes increasing at a geometric rate, we prove that the suboptimality diminishes at a geometric rate, the optimal deterministic rate while iteration and oracle complexity to obtain an ε-optimal solution are (Formula presented.) and (Formula presented.) with (Formula presented.). In the single block setting, we obtain the optimal oracle complexity (Formula presented.). Finally, preliminary numerics suggest that the schemes compare well with competitors reliant on global Lipschitz constants.

Original languageEnglish (US)
Pages (from-to)264-294
Number of pages31
JournalOptimization Methods and Software
Volume37
Issue number1
DOIs
StatePublished - 2022

All Science Journal Classification (ASJC) codes

  • Software
  • Control and Optimization
  • Applied Mathematics

Fingerprint

Dive into the research topics of 'Asynchronous variance-reduced block schemes for composite non-convex stochastic optimization: block-specific steplengths and adapted batch-sizes'. Together they form a unique fingerprint.

Cite this