TY - GEN
T1 - Data-driven first-order methods for misspecified convex optimization problems
T2 - 2014 53rd IEEE Annual Conference on Decision and Control, CDC 2014
AU - Ahmadi, Hesam
AU - Shanbhag, Uday V.
N1 - Publisher Copyright:
© 2014 IEEE.
PY - 2014
Y1 - 2014
N2 - We consider a misspecified optimization problem that requires minimizing a convex function f(x; θ*) in x over a closed and convex set X, where θ* is an unknown vector of parameters. Suppose θ* may be learnt by a parallel learning process that generates a sequence of estimators θk, each of which is an increasingly accurate approximation of θ*. In this context, we examine the development of coupled schemes that generate iterates (xk, θk) such that as the iteration index k → ∞, xk → x*, a minimizer of f(x; θ*) over X, and θk → θ*. We make two sets of contributions along this direction. First, we consider the use of gradient methods and show that such techniques are globally convergent. In addition, such schemes show a quantifiable degradation in the linear rate of convergence observed for strongly convex optimization problems. When strong convexity assumptions are weakened, the O(1/K) convergence rate in function values is modified by an additive factor ‖θ0 - θ*‖O(qg^K + 1/K), where ‖θ0 - θ*‖ represents the initial misspecification in θ* and qg denotes the contractive factor associated with the learning process. Second, we present an averaging-based subgradient scheme and show that the optimal constant steplength leads to a modification in the rate by ‖θ0 - θ*‖O(qg^K + 1/K), implying no effect on the standard rate of O(1/√K).
AB - We consider a misspecified optimization problem that requires minimizing a convex function f(x; θ*) in x over a closed and convex set X, where θ* is an unknown vector of parameters. Suppose θ* may be learnt by a parallel learning process that generates a sequence of estimators θk, each of which is an increasingly accurate approximation of θ*. In this context, we examine the development of coupled schemes that generate iterates (xk, θk) such that as the iteration index k → ∞, xk → x*, a minimizer of f(x; θ*) over X, and θk → θ*. We make two sets of contributions along this direction. First, we consider the use of gradient methods and show that such techniques are globally convergent. In addition, such schemes show a quantifiable degradation in the linear rate of convergence observed for strongly convex optimization problems. When strong convexity assumptions are weakened, the O(1/K) convergence rate in function values is modified by an additive factor ‖θ0 - θ*‖O(qg^K + 1/K), where ‖θ0 - θ*‖ represents the initial misspecification in θ* and qg denotes the contractive factor associated with the learning process. Second, we present an averaging-based subgradient scheme and show that the optimal constant steplength leads to a modification in the rate by ‖θ0 - θ*‖O(qg^K + 1/K), implying no effect on the standard rate of O(1/√K).
UR - https://www.scopus.com/pages/publications/84931846719
UR - https://www.scopus.com/pages/publications/84931846719#tab=citedBy
U2 - 10.1109/CDC.2014.7040048
DO - 10.1109/CDC.2014.7040048
M3 - Conference contribution
AN - SCOPUS:84931846719
T3 - Proceedings of the IEEE Conference on Decision and Control
SP - 4228
EP - 4233
BT - 53rd IEEE Conference on Decision and Control, CDC 2014
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 15 December 2014 through 17 December 2014
ER -