Trading regret for efficiency: Online convex optimization with long term constraints

Mehrdad Mahdavi, Rong Jin, Tianbao Yang

Research output: Contribution to journalArticlepeer-review

163 Scopus citations


In this paper we propose efficient algorithms for solving constrained online convex optimization problems. Our motivation stems from the observation that most algorithms proposed for online convex optimization require a projection onto the convex set K from which the decisions are made. While the projection is straightforward for simple shapes (e.g., Euclidean ball), for arbitrary complex sets it is the main computational challenge and may be inefficient in practice. In this paper, we consider an alternative online convex optimization problem. Instead of requiring that decisions belong to K for all rounds, we only require that the constraints, which define the set K , be satisfied in the long run. By turning the problem into an online convex-concave optimization problem, we propose an efficient algorithm which achieves O(√T) regret bound and O(T3/4) bound on the violation of constraints. Then, we modify the algorithm in order to guarantee that the constraints are satisfied in the long run. This gain is achieved at the price of getting O(T3/4) regret bound. Our second algorithm is based on the mirror prox method (Nemirovski, 2005) to solve variational inequalities which achieves O(T2/3) bound for both regret and the violation of constraints when the domain K can be described by a finite number of linear constraints. Finally, we extend the results to the setting where we only have partial access to the convex set K and propose a multipoint bandit feedback algorithm with the same bounds in expectation as our first algorithm.

Original languageEnglish (US)
Pages (from-to)2503-2528
Number of pages26
JournalJournal of Machine Learning Research
StatePublished - Sep 2012

All Science Journal Classification (ASJC) codes

  • Software
  • Artificial Intelligence
  • Control and Systems Engineering
  • Statistics and Probability


Dive into the research topics of 'Trading regret for efficiency: Online convex optimization with long term constraints'. Together they form a unique fingerprint.

Cite this