TY - GEN
T1 - Strategyproof reinforcement learning for online resource allocation
AU - Stein, Sebastian
AU - Ochal, Mateusz
AU - Moisoiu, Ioana Adriana
AU - Gerding, Enrico
AU - Ganti, Raghu
AU - He, Ting
AU - Porta, Tom La
N1 - Publisher Copyright:
© 2020 International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS). All rights reserved.
PY - 2020
Y1 - 2020
N2 - We consider an online resource allocation problem where tasks with specific values, sizes and resource requirements arrive dynamically over time, and have to be either serviced or rejected immediately. Reinforcement learning is a promising approach for this, but existing work on reinforcement learning has neglected that task owners may misreport their task requirements or values strategically when this is to their benefit. To address this, we apply mechanism design and propose a novel mechanism based on reinforcement learning that aims to maximise social welfare, is strategyproof and individually rational (i.e., truthful reporting and participation are incentivised). In experiments, we show that our algorithm achieves results that are typically within 90% of the optimal social welfare, while outperforming approaches that use fixed pricing (by up to 86% in specific cases).
AB - We consider an online resource allocation problem where tasks with specific values, sizes and resource requirements arrive dynamically over time, and have to be either serviced or rejected immediately. Reinforcement learning is a promising approach for this, but existing work on reinforcement learning has neglected that task owners may misreport their task requirements or values strategically when this is to their benefit. To address this, we apply mechanism design and propose a novel mechanism based on reinforcement learning that aims to maximise social welfare, is strategyproof and individually rational (i.e., truthful reporting and participation are incentivised). In experiments, we show that our algorithm achieves results that are typically within 90% of the optimal social welfare, while outperforming approaches that use fixed pricing (by up to 86% in specific cases).
UR - http://www.scopus.com/inward/record.url?scp=85096673364&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85096673364&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85096673364
T3 - Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS
SP - 1296
EP - 1304
BT - Proceedings of the 19th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2020
A2 - An, Bo
A2 - El Fallah Seghrouchni, Amal
A2 - Sukthankar, Gita
PB - International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS)
T2 - 19th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2020
Y2 - 19 May 2020
ER -