TY - GEN
T1 - Private Posterior Inference Consistent with Public Information
T2 - International Conference on Privacy in Statistical Databases, PSD 2020
AU - Seeman, Jeremy
AU - Slavkovic, Aleksandra
AU - Reimherr, Matthew
N1 - Publisher Copyright: © 2020, Springer Nature Switzerland AG.
PY - 2020
Y1 - 2020
N2 - Methods for generating differentially-private (DP) synthetic data have received recent attention as large government agencies such as the U.S. Census have decided to release DP synthetic data for public usage. In the synthetic data generation process, it is common to post-process the privatized results so that the final synthetic data agrees with what the data curator considers public information. Our contributions are threefold: 1) we show empirically that using post-processing to incorporate public information in contingency tables can lead to sub-optimal inference, 2) we propose an alternative Bayesian sampling framework that directly incorporates both noise due to DP and public information constraints, leading to improved inference, and 3) we demonstrate the proposed methodology on a study of the relationship between mortality rate and race in small areas given privatized data from the CDC and U.S. Census.
AB - Methods for generating differentially-private (DP) synthetic data have received recent attention as large government agencies such as the U.S. Census have decided to release DP synthetic data for public usage. In the synthetic data generation process, it is common to post-process the privatized results so that the final synthetic data agrees with what the data curator considers public information. Our contributions are threefold: 1) we show empirically that using post-processing to incorporate public information in contingency tables can lead to sub-optimal inference, 2) we propose an alternative Bayesian sampling framework that directly incorporates both noise due to DP and public information constraints, leading to improved inference, and 3) we demonstrate the proposed methodology on a study of the relationship between mortality rate and race in small areas given privatized data from the CDC and U.S. Census.
UR - http://www.scopus.com/inward/record.url?scp=85092084828&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85092084828&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-57521-2_23
DO - 10.1007/978-3-030-57521-2_23
M3 - Conference contribution
AN - SCOPUS:85092084828
SN - 9783030575205
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 323
EP - 336
BT - Privacy in Statistical Databases - UNESCO Chair in Data Privacy, International Conference, PSD 2020, Proceedings
A2 - Domingo-Ferrer, Josep
A2 - Muralidhar, Krishnamurty
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 23 September 2020 through 25 September 2020
ER -