TY - GEN
T1 - An Uncertainty Principle is a Price of Privacy-Preserving Microdata
AU - Abowd, John
AU - Ashmead, Robert
AU - Cumings-Menon, Ryan
AU - Garfinkel, Simson
AU - Kifer, Daniel
AU - Leclerc, Philip
AU - Sexton, William
AU - Simpson, Ashley
AU - Task, Christine
AU - Zhuravlev, Pavel
N1 - Publisher Copyright:
© 2021 Neural information processing systems foundation. All rights reserved.
PY - 2021
Y1 - 2021
N2 - Privacy-protected microdata are often the desired output of a differentially private algorithm since microdata is familiar and convenient for downstream users. However, there is a statistical price for this kind of convenience. We show that an uncertainty principle governs the trade-off between accuracy for a population of interest (“sum query”) vs. accuracy for its component sub-populations (“point queries”). Compared to differentially private query answering systems that are not required to produce microdata, accuracy can degrade by a logarithmic factor. For example, in the case of pure differential privacy, without the microdata requirement, one can provide noisy answers to the sum query and all point queries while guaranteeing that each answer has squared error O(1/∊2). With the microdata requirement, one must choose between allowing an additional log2(d) factor (d is the number of point queries) for some point queries or allowing an extra O(d2) factor for the sum query. We present lower bounds for pure, approximate, and concentrated differential privacy. We propose mitigation strategies and create a collection of benchmark datasets that can be used for public study of this problem.
AB - Privacy-protected microdata are often the desired output of a differentially private algorithm since microdata is familiar and convenient for downstream users. However, there is a statistical price for this kind of convenience. We show that an uncertainty principle governs the trade-off between accuracy for a population of interest (“sum query”) vs. accuracy for its component sub-populations (“point queries”). Compared to differentially private query answering systems that are not required to produce microdata, accuracy can degrade by a logarithmic factor. For example, in the case of pure differential privacy, without the microdata requirement, one can provide noisy answers to the sum query and all point queries while guaranteeing that each answer has squared error O(1/∊2). With the microdata requirement, one must choose between allowing an additional log2(d) factor (d is the number of point queries) for some point queries or allowing an extra O(d2) factor for the sum query. We present lower bounds for pure, approximate, and concentrated differential privacy. We propose mitigation strategies and create a collection of benchmark datasets that can be used for public study of this problem.
UR - http://www.scopus.com/inward/record.url?scp=85129035725&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85129035725&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85129035725
T3 - Advances in Neural Information Processing Systems
SP - 11883
EP - 11895
BT - Advances in Neural Information Processing Systems 34 - 35th Conference on Neural Information Processing Systems, NeurIPS 2021
A2 - Ranzato, Marc'Aurelio
A2 - Beygelzimer, Alina
A2 - Dauphin, Yann
A2 - Liang, Percy S.
A2 - Wortman Vaughan, Jenn
PB - Neural information processing systems foundation
T2 - 35th Conference on Neural Information Processing Systems, NeurIPS 2021
Y2 - 6 December 2021 through 14 December 2021
ER -