TY - GEN
T1 - Logistic regression with variables subject to post randomization method
AU - Woo, Yong Ming Jeffrey
AU - Slavković, Aleksandra B.
PY - 2012
Y1 - 2012
AB - The Post Randomization Method (PRAM) is a disclosure avoidance method in which values of categorical variables are perturbed via some known probability mechanism and only the perturbed data are released, thus raising issues regarding disclosure risk and data utility. In this paper, we develop and implement a number of EM algorithms to obtain unbiased estimates of the logistic regression model with data subject to PRAM, and thus effectively account for the effects of PRAM and preserve data utility. Three different cases are considered: (1) covariates subject to PRAM, (2) response variable subject to PRAM, and (3) both covariates and response variable subject to PRAM. The proposed techniques improve on current methodology by increasing the applicability of PRAM to a wider range of products and could be extended to other types of generalized linear models. The effects of the level of perturbation and sample size on the estimates are evaluated, and relevant standard error estimates are developed and reported.
UR - http://www.scopus.com/inward/record.url?scp=84867509153&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84867509153&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-33627-0_10
DO - 10.1007/978-3-642-33627-0_10
M3 - Conference contribution
AN - SCOPUS:84867509153
SN - 9783642336263
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 116
EP - 130
BT - Privacy in Statistical Databases - UNESCO Chair in Data Privacy, International Conference, PSD 2012, Proceedings
PB - Springer Verlag
T2 - International Conference on Privacy in Statistical Databases, PSD 2012
Y2 - 26 September 2012 through 28 September 2012
ER -