TY - GEN
T1 - A distributed joint-learning and auction algorithm for target assignment
AU - Sadikhov, Teymur
AU - Zhu, Minghui
AU - Martínez, Sonia
PY - 2010
Y1 - 2010
N2 - We consider an agent-target assignment problem in an unknown environment modeled as an undirected graph. Agents incur a cost or reward while traveling along the edges of this graph. The agents know neither the graph nor the locations of the targets on it, but they can obtain local information about both through local sensing and by communicating with other agents within a limited range. To solve this problem, we propose a new distributed algorithm that integrates Q-Learning with a distributed auction. The Q-Learning part estimates the assignment benefits, computed by summing the rewards over the graph edges for each agent-target pair, while the auction part assigns agents to targets in a distributed fashion. The algorithm is shown to terminate in finite time with a near-optimal assignment. Optimality refers to maximization of the assignment benefit, which can depend on the value of an agent-target pair and on the routing cost for the agent to visit the target.
AB - We consider an agent-target assignment problem in an unknown environment modeled as an undirected graph. Agents incur a cost or reward while traveling along the edges of this graph. The agents know neither the graph nor the locations of the targets on it, but they can obtain local information about both through local sensing and by communicating with other agents within a limited range. To solve this problem, we propose a new distributed algorithm that integrates Q-Learning with a distributed auction. The Q-Learning part estimates the assignment benefits, computed by summing the rewards over the graph edges for each agent-target pair, while the auction part assigns agents to targets in a distributed fashion. The algorithm is shown to terminate in finite time with a near-optimal assignment. Optimality refers to maximization of the assignment benefit, which can depend on the value of an agent-target pair and on the routing cost for the agent to visit the target.
UR - http://www.scopus.com/inward/record.url?scp=79953150786&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79953150786&partnerID=8YFLogxK
U2 - 10.1109/CDC.2010.5718180
DO - 10.1109/CDC.2010.5718180
M3 - Conference contribution
AN - SCOPUS:79953150786
SN - 9781424477456
T3 - Proceedings of the IEEE Conference on Decision and Control
SP - 5450
EP - 5455
BT - 2010 49th IEEE Conference on Decision and Control, CDC 2010
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 49th IEEE Conference on Decision and Control, CDC 2010
Y2 - 15 December 2010 through 17 December 2010
ER -