Abstract
We investigate a model in which a defender and an attacker simultaneously and repeatedly adjust their defenses and attacks. Under this model, we propose two iterative reinforcement learning algorithms that allow the defender to identify optimal defenses when information about the attacker is limited. With probability one, the adaptive reinforcement learning algorithm converges to the best response to the attacks when the attacker's exploration of the system diminishes over time. With probability arbitrarily close to one, the robust reinforcement learning algorithm converges to the min-max strategy even when the attacker persistently explores the system. The convergence of both algorithms is formally proven, and their performance is verified via numerical simulations.
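The two solution concepts named in the abstract, best response and min-max strategy, can be illustrated on a small matrix game. The sketch below is not the paper's algorithm: it involves no learning, it restricts the defender to pure strategies, and the loss matrix `LOSS` and the function names are hypothetical values chosen only for illustration.

```python
# Hypothetical loss matrix: rows are defenses, columns are attacks.
# Entries are the defender's loss; the values are illustrative only
# and do not come from the paper.
LOSS = [
    [1.0, 4.0, 2.0],
    [3.0, 1.0, 3.0],
    [2.0, 2.0, 2.5],
]


def expected_loss(defense, attack_dist):
    """Expected loss of one defense against a distribution over attacks."""
    return sum(p * loss for p, loss in zip(attack_dist, LOSS[defense]))


def best_response(attack_dist):
    """Index of the defense that minimizes expected loss against attack_dist."""
    return min(range(len(LOSS)), key=lambda d: expected_loss(d, attack_dist))


def minmax_defense():
    """Index of the pure defense that minimizes the worst-case loss."""
    return min(range(len(LOSS)), key=lambda d: max(LOSS[d]))


if __name__ == "__main__":
    # Assumed attack distribution, e.g. estimated from observed attacks.
    print("best response:", best_response([0.8, 0.1, 0.1]))
    print("min-max defense:", minmax_defense())
```

The best response depends on the assumed attack distribution, whereas the min-max defense guards against the worst attack regardless of how the attacker behaves. That difference is the one the abstract draws between the adaptive and the robust algorithm.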
Original language | English (US) |
---|---|
Pages (from-to) | 51-58 |
Number of pages | 8 |
Journal | Proceedings of the ACM Conference on Computer and Communications Security |
Volume | 2014-November |
Issue number | November |
State | Published - Nov 7 2014 |
Event | 1st ACM Workshop on Moving Target Defense, MTD 2014 - Co-located with 21st ACM Conference on Computer and Communications Security, CCS 2014 - Scottsdale, United States Duration: Nov 3 2014 → … |
All Science Journal Classification (ASJC) codes
- Software
- Computer Networks and Communications