Smoothed Variable Sample-Size Accelerated Proximal Methods for Nonsmooth Stochastic Convex Programs

Afrooz Jalilzadeh, Uday Shanbhag, Jose Blanchet, Peter W. Glynn

Research output: Contribution to journalArticlepeer-review

2 Scopus citations

Abstract

We consider the unconstrained minimization of the function F, where F = f + g, f is an expectation-valued nonsmooth convex or strongly convex function, and g is a closed, convex, and proper function. (I) Strongly convex f. When f is μ-strongly convex in x, traditional stochastic subgradient schemes (SSG) often display poor behavior, aris-ing in part from noisy subgradients and diminishing steplengths. Instead, we apply a variable sample-size accelerated proximal scheme (VS-APM) on F, the Moreau enve-lope of F; we term such a scheme as (mVS-APM) and in contrast with (SSG) schemes, (mVS-APM) utilizes constant steplengths and increasingly exact gradients. We consider two settings. (a) Bounded domains. In this setting, (mVS-APM) displays linear convergence in inexact gradient steps, each of which requires utilizing an inner (prox-SSG) scheme. Specically, (mVS-APM) achieves an optimal oracle complexity in prox-SSG steps of O(1=ɛ) with an iteration complexity of O(log(1=ɛ)) in inexact (outer) gradients of F to achieve an ɛ-accurate solution in mean-squared error, computed via an increasing number of inner (stochastic) subgradient steps; (b) Unbounded domains. In this regime, under an assumption of state-dependent bounds on subgradients, an unaccel-erated variant (mVS-APM) is linearly convergent where increasingly exact gradients ∇x F(x) are approximated with increasing accuracy via (SSG) schemes. Notably, (mVS-APM) also displays an optimal oracle complexity of O(1=ɛ); (II) Convex f. When f is merely convex but smoothable, by suitable choices of the smoothing, steplength, and batch-size sequences, smoothed (VS-APM) (or sVS-APM) achieves an optimal oracle complexity of O(1=ɛ2) to obtain an ɛ-optimal solution. Our results can be specialized to two important cases: (a) Smooth f. Since smoothing is no longer required, we observe that (VS-APM) admits the optimal rate and oracle complexity, matching prior ndings; (b) Deterministic nonsmooth f. In the nonsmooth deterministic regime, (sVS-APM) reduces to a smoothed accelerated proximal method (s-APM) that is both asymptotically convergent and optimal in that it displays a complexity of O(1=ɛ), matching the bound provided by Nesterov in 2005 for producing ɛ-optimal solutions. Finally, (sVS-APM) and (VS-APM) pro-duce sequences that converge almost surely to a solution of the original problem.

Original languageEnglish (US)
Pages (from-to)373-410
Number of pages38
JournalStochastic Systems
Volume12
Issue number4
DOIs
StatePublished - Dec 2022

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Modeling and Simulation
  • Statistics, Probability and Uncertainty
  • Management Science and Operations Research

Fingerprint

Dive into the research topics of 'Smoothed Variable Sample-Size Accelerated Proximal Methods for Nonsmooth Stochastic Convex Programs'. Together they form a unique fingerprint.

Cite this