Abstract
This paper studies federated linear contextual bandits under the notion of user-level differential privacy (DP). We first introduce a unified federated bandits framework that can accommodate various definitions of DP in the sequential decision-making setting. We then formally introduce user-level central DP (CDP) and local DP (LDP) in the federated bandits framework, and investigate the fundamental trade-offs between the learning regrets and the corresponding DP guarantees in a federated linear contextual bandits model. For CDP, we propose a federated algorithm termed as ROBIN and show that it is near-optimal in terms of the number of clients M and the privacy budget ε by deriving nearly-matching upper and lower regret bounds when user-level DP is satisfied. For LDP, we obtain several lower bounds, indicating that learning under user-level (ε, δ)-LDP must suffer a regret blow-up factor at least min{1/ε, M} or min{1/√ε, √M} under different conditions.
| Original language | English (US) |
|---|---|
| Pages (from-to) | 14060-14095 |
| Number of pages | 36 |
| Journal | Proceedings of Machine Learning Research |
| Volume | 202 |
| State | Published - 2023 |
| Event | 40th International Conference on Machine Learning, ICML 2023 - Honolulu, United States Duration: Jul 23 2023 → Jul 29 2023 |
All Science Journal Classification (ASJC) codes
- Software
- Control and Systems Engineering
- Statistics and Probability
- Artificial Intelligence
Fingerprint
Dive into the research topics of 'Federated Linear Contextual Bandits with User-level Differential Privacy'. Together they form a unique fingerprint.Cite this
- APA
- Author
- BIBTEX
- Harvard
- Standard
- RIS
- Vancouver