TY - GEN
T1 - Constrained Bandit Learning with Switching Costs for Wireless Networks
AU - Steiger, Juaren
AU - Li, Bin
AU - Ji, Bo
AU - Lu, Ning
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Bandits with arm selection constraints and bandits with switching costs have both gained recent attention in wireless networking research. Pessimistic-optimistic algorithms, which combine bandit learning with virtual queues to track the constraints, are commonly employed in the former. Block-based algorithms, where switching is disallowed within a block, are commonly employed in the latter. While efficient algorithms have been developed for both problems, it remains challenging to guarantee low regret and constraint violation in a bandit problem that includes both arm selection constraints and switching costs due to the tight coupling between the two. Here, switching may be necessary to decrease the constraint violation but comes at the cost of increased switching regret. In this paper, we tackle the constrained bandits with switching costs problem, for which we design a block-based pessimistic-optimistic algorithm. We identify three timely wireless networking applications for this framework in edge computing, mobile crowdsensing, and wireless network selection. We also prove that our algorithm achieves sublinear regret and vanishing constraint violation and corroborate these results with synthetic simulations and extensive trace-based simulations in the wireless network selection setting.
AB - Bandits with arm selection constraints and bandits with switching costs have both gained recent attention in wireless networking research. Pessimistic-optimistic algorithms, which combine bandit learning with virtual queues to track the constraints, are commonly employed in the former. Block-based algorithms, where switching is disallowed within a block, are commonly employed in the latter. While efficient algorithms have been developed for both problems, it remains challenging to guarantee low regret and constraint violation in a bandit problem that includes both arm selection constraints and switching costs due to the tight coupling between the two. Here, switching may be necessary to decrease the constraint violation but comes at the cost of increased switching regret. In this paper, we tackle the constrained bandits with switching costs problem, for which we design a block-based pessimistic-optimistic algorithm. We identify three timely wireless networking applications for this framework in edge computing, mobile crowdsensing, and wireless network selection. We also prove that our algorithm achieves sublinear regret and vanishing constraint violation and corroborate these results with synthetic simulations and extensive trace-based simulations in the wireless network selection setting.
UR - https://www.scopus.com/pages/publications/85171624942
UR - https://www.scopus.com/pages/publications/85171624942#tab=citedBy
U2 - 10.1109/INFOCOM53939.2023.10228986
DO - 10.1109/INFOCOM53939.2023.10228986
M3 - Conference contribution
AN - SCOPUS:85171624942
T3 - Proceedings - IEEE INFOCOM
BT - INFOCOM 2023 - IEEE Conference on Computer Communications
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 42nd IEEE International Conference on Computer Communications, INFOCOM 2023
Y2 - 17 May 2023 through 20 May 2023
ER -