META INVERSE CONSTRAINED REINFORCEMENT LEARNING: CONVERGENCE GUARANTEE AND GENERALIZATION ANALYSIS

Shicheng Liu, Minghui Zhu

Research output: Contribution to conference › Paper › peer-review

3 Scopus citations

Abstract

This paper considers the problem of learning an expert's reward function and constraints from a few demonstrations. This problem can be cast as a meta-learning problem: we first learn meta-priors over reward functions and constraints from other distinct but related tasks, and then adapt the learned meta-priors to new tasks from only a few expert demonstrations. We formulate a bi-level optimization problem where the upper level learns a meta-prior over reward functions and the lower level learns a meta-prior over constraints. We propose a novel algorithm to solve this problem and formally guarantee that it reaches the set of ϵ-stationary points with iteration complexity O(1/ϵ²). We also quantify the generalization error on an arbitrary new task. Experiments validate that the learned meta-priors adapt to new tasks with good performance from only a few demonstrations.
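The alternating bi-level scheme described in the abstract can be illustrated on a toy problem. The sketch below is hypothetical: the quadratic objectives `f` (standing in for the upper-level reward meta-prior loss) and `g` (standing in for the lower-level constraint meta-prior loss) are placeholders chosen for clarity, not the paper's actual objectives, and the hypergradient is computed in closed form only because this toy inner problem admits one.

```python
# Toy bi-level optimization in the spirit of the abstract's formulation
# (hypothetical objectives, NOT the paper's algorithm):
#   lower level:  phi*(theta) = argmin_phi g(theta, phi)
#   upper level:  min_theta   f(theta, phi*(theta))

def g(theta, phi):
    # Lower-level loss: stand-in for fitting the constraint meta-prior.
    return (phi - theta) ** 2

def f(theta, phi):
    # Upper-level loss: stand-in for fitting the reward meta-prior.
    return theta ** 2 + 0.5 * phi ** 2

def bilevel_gd(theta=5.0, phi=0.0, outer_steps=200, inner_steps=10, lr=0.05):
    for _ in range(outer_steps):
        # Inner loop: approximately solve the lower-level problem for fixed theta.
        for _ in range(inner_steps):
            phi -= lr * 2.0 * (phi - theta)        # d g / d phi
        # Outer step: for this toy g, phi*(theta) = theta, so d phi*/d theta = 1
        # and the chain rule gives the hypergradient of f(theta, phi*(theta)):
        grad_theta = 2.0 * theta + phi * 1.0
        theta -= lr * grad_theta
    return theta, phi

theta, phi = bilevel_gd()
```

Both variables converge toward the joint minimizer (0, 0); the inner loop tracks `phi*(theta)` while the outer loop descends the composed objective. Analyses like the one in the paper bound how fast such alternating updates reach an ϵ-stationary point.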

Original language: English (US)
State: Published - 2024
Event: 12th International Conference on Learning Representations, ICLR 2024 - Hybrid, Vienna, Austria
Duration: May 7, 2024 – May 11, 2024

Conference

Conference: 12th International Conference on Learning Representations, ICLR 2024
Country/Territory: Austria
City: Hybrid, Vienna
Period: 5/7/24 – 5/11/24

All Science Journal Classification (ASJC) codes

  • Language and Linguistics
  • Computer Science Applications
  • Education
  • Linguistics and Language

