Robust Bayesian inverse reinforcement learning with sparse behavior noise

Jiangchuan Zheng, Siyuan Liu, Lionel M. Ni

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

53 Scopus citations

Abstract

Inverse reinforcement learning (IRL) aims to recover the reward function underlying a Markov Decision Process from behaviors of experts in support of decision-making. Most recent work on IRL assumes the same level of trustworthiness of all expert behaviors, and frames IRL as a process of seeking a reward function that makes those behaviors appear (near-)optimal. However, it is common in reality that noisy expert behaviors disobeying the optimal policy exist, which may degrade IRL performance significantly. To address this issue, in this paper, we develop a robust IRL framework that can accurately estimate the reward function in the presence of behavior noise. In particular, we focus on a special type of behavior noise referred to as sparse noise, due to its prevalence in real-world behavior data. To model such noise, we introduce a novel latent variable characterizing the reliability of each expert action and use a Laplace distribution as its prior. We then devise an EM algorithm with a novel variational inference procedure in the E-step, which can automatically identify and remove behavior noise in reward learning. Experiments on both synthetic data and real vehicle routing data with noticeable behavior noise show significant improvement of our method over previous approaches in learning accuracy, and also show its power in de-noising behavior data.
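The abstract's choice of a Laplace prior can be illustrated with a toy example. The sketch below is not the paper's EM or variational procedure; it only shows the standard fact that a Laplace prior combined with Gaussian observation noise yields a soft-thresholding MAP estimate, which sets small values exactly to zero and keeps only large ones. The function name `laplace_map`, the parameters `sigma` and `b`, and the per-action deviation scores are all hypothetical and chosen purely for illustration.

```python
def laplace_map(y, sigma=1.0, b=1.0):
    """MAP estimate of x from y = x + e, with e ~ N(0, sigma^2) and x ~ Laplace(0, b).

    Minimizing (y - x)^2 / (2 sigma^2) + |x| / b gives soft-thresholding
    with threshold sigma^2 / b.
    """
    threshold = sigma ** 2 / b
    magnitude = max(abs(y) - threshold, 0.0)
    return magnitude if y >= 0 else -magnitude


# Hypothetical deviation scores, one per expert action (made-up numbers):
# most are near zero, two are large outliers.
deviations = [0.1, -0.3, 4.0, 0.2, -5.5]

estimates = [laplace_map(y, sigma=1.0, b=1.0) for y in deviations]

# Actions whose estimated noise term is nonzero are flagged as unreliable.
flagged = [i for i, x in enumerate(estimates) if x != 0.0]
print(flagged)
```

With these made-up inputs, only the two large deviations survive the threshold, which is the sense in which a Laplace prior favors sparse noise: most actions are treated as reliable and a few are singled out.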

Original language: English (US)
Title of host publication: Proceedings of the National Conference on Artificial Intelligence
Publisher: AI Access Foundation
Pages: 2198-2205
Number of pages: 8
ISBN (Electronic): 9781577356790
State: Published - 2014
Event: 28th AAAI Conference on Artificial Intelligence, AAAI 2014, 26th Innovative Applications of Artificial Intelligence Conference, IAAI 2014 and the 5th Symposium on Educational Advances in Artificial Intelligence, EAAI 2014 - Quebec City, Canada
Duration: Jul 27 2014 → Jul 31 2014

Publication series

Name: Proceedings of the National Conference on Artificial Intelligence
Volume: 3

Other

Other: 28th AAAI Conference on Artificial Intelligence, AAAI 2014, 26th Innovative Applications of Artificial Intelligence Conference, IAAI 2014 and the 5th Symposium on Educational Advances in Artificial Intelligence, EAAI 2014
Country/Territory: Canada
City: Quebec City
Period: 7/27/14 → 7/31/14

All Science Journal Classification (ASJC) codes

  • Software
  • Artificial Intelligence
