TY - GEN
T1 - Adversarial policy training against deep reinforcement learning
AU - Wu, Xian
AU - Guo, Wenbo
AU - Wei, Hua
AU - Xing, Xinyu
N1 - Publisher Copyright:
© 2021 by The USENIX Association. All rights reserved.
PY - 2021
Y1 - 2021
N2 - Reinforcement learning is a set of goal-oriented learning algorithms, through which an agent could learn to behave in an environment, by performing certain actions and observing the reward which it gets from those actions. Integrated with deep neural networks, it becomes deep reinforcement learning, a new paradigm of learning methods. Recently, deep reinforcement learning demonstrates great potential in many applications such as playing video games, mastering GO competition, and even performing autonomous pilot. However, coming together with these great successes is adversarial attacks, in which an adversary could force a well-trained agent to behave abnormally by tampering the input to the agent's policy network or training an adversarial agent to exploit the weakness of the victim. In this work, we show existing adversarial attacks against reinforcement learning either work in an impractical setting or perform less effectively when being launched in a two-agent competitive game. Motivated by this, we propose a new method to train adversarial agents. Technically speaking, our approach extends the Proximal Policy Optimization (PPO) algorithm and then utilizes an explainable AI technique to guide an attacker to train an adversarial agent. In comparison with the adversarial agent trained by the state-of-the-art technique, we show that our adversarial agent exhibits a much stronger capability in exploiting the weakness of victim agents. Besides, we demonstrate that our adversarial attack introduces less variation in the training process and exhibits less sensitivity to the selection of initial states.
AB - Reinforcement learning is a set of goal-oriented learning algorithms, through which an agent could learn to behave in an environment, by performing certain actions and observing the reward which it gets from those actions. Integrated with deep neural networks, it becomes deep reinforcement learning, a new paradigm of learning methods. Recently, deep reinforcement learning demonstrates great potential in many applications such as playing video games, mastering GO competition, and even performing autonomous pilot. However, coming together with these great successes is adversarial attacks, in which an adversary could force a well-trained agent to behave abnormally by tampering the input to the agent's policy network or training an adversarial agent to exploit the weakness of the victim. In this work, we show existing adversarial attacks against reinforcement learning either work in an impractical setting or perform less effectively when being launched in a two-agent competitive game. Motivated by this, we propose a new method to train adversarial agents. Technically speaking, our approach extends the Proximal Policy Optimization (PPO) algorithm and then utilizes an explainable AI technique to guide an attacker to train an adversarial agent. In comparison with the adversarial agent trained by the state-of-the-art technique, we show that our adversarial agent exhibits a much stronger capability in exploiting the weakness of victim agents. Besides, we demonstrate that our adversarial attack introduces less variation in the training process and exhibits less sensitivity to the selection of initial states.
UR - http://www.scopus.com/inward/record.url?scp=85114486573&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85114486573&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85114486573
T3 - Proceedings of the 30th USENIX Security Symposium
SP - 1883
EP - 1900
BT - Proceedings of the 30th USENIX Security Symposium
PB - USENIX Association
T2 - 30th USENIX Security Symposium, USENIX Security 2021
Y2 - 11 August 2021 through 13 August 2021
ER -