Logistic regression with variables subject to post randomization method

Yong Ming Jeffrey Woo, Aleksandra B. Slavković

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution



The Post Randomization Method (PRAM) is a disclosure avoidance method in which values of categorical variables are perturbed via some known probability mechanism and only the perturbed data are released, which raises issues regarding disclosure risk and data utility. In this paper, we develop and implement a number of EM algorithms to obtain unbiased estimates of the logistic regression model with data subject to PRAM, and thus effectively account for the effects of PRAM and preserve data utility. Three different cases are considered: (1) covariates subject to PRAM, (2) response variable subject to PRAM, and (3) both covariates and response variables subject to PRAM. The proposed techniques improve on current methodology by increasing the applicability of PRAM to a wider range of products and could be extended to other types of generalized linear models. The effects of the level of perturbation and sample size on the estimates are evaluated, and relevant standard error estimates are developed and reported.
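As a rough illustration of case (2), the sketch below applies PRAM to a binary response through a known transition matrix and then runs a generic EM loop that treats the true response as missing data. It is not the authors' implementation: the function names (`pram`, `fit_logistic`, `em_pram_response`), the transition matrix values, and the simulated data are all assumptions chosen for the example, and the M-step is an ordinary Newton-Raphson logistic fit with fractional responses.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pram(y, P, rng):
    """Perturb binary labels y with P[i, j] = Pr(Y* = j | Y = i)."""
    u = rng.random(y.shape[0])
    return np.where(u < P[y, 1], 1, 0)

def fit_logistic(X, w, beta, n_newton=25):
    """Newton-Raphson logistic fit; w may hold fractional responses in [0, 1]."""
    for _ in range(n_newton):
        p = sigmoid(X @ beta)
        grad = X.T @ (w - p)
        hess = X.T @ (X * (p * (1.0 - p))[:, None])
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < 1e-10:
            break
    return beta

def em_pram_response(X, y_star, P, n_iter=200, tol=1e-8):
    """EM for logistic regression when only the PRAMed response y_star is seen."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        # E-step: posterior Pr(Y = 1 | Y* = y_star, x) under the current beta.
        p = sigmoid(X @ beta)
        a = p * P[1, y_star]
        b = (1.0 - p) * P[0, y_star]
        w = a / (a + b)
        # M-step: logistic fit with the posterior weights as responses.
        new_beta = fit_logistic(X, w, beta)
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta

# Simulated example (all values below are illustrative assumptions).
rng = np.random.default_rng(0)
n = 20000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([-0.5, 1.0])
y = (rng.random(n) < sigmoid(X @ beta_true)).astype(int)

P = np.array([[0.9, 0.1],   # Pr(Y* = 0 | Y = 0), Pr(Y* = 1 | Y = 0)
              [0.2, 0.8]])  # Pr(Y* = 0 | Y = 1), Pr(Y* = 1 | Y = 1)
y_star = pram(y, P, rng)

beta_naive = fit_logistic(X, y_star.astype(float), np.zeros(2))  # ignores PRAM
beta_em = em_pram_response(X, y_star, P)                         # accounts for PRAM
print("naive:", beta_naive)
print("EM:   ", beta_em)
```

In this kind of simulation, a fit that ignores PRAM typically gives an attenuated slope, while the EM fit uses the known transition matrix to correct for the perturbation. Cases (1) and (3) would need an additional model for the perturbed covariates, which this sketch does not attempt.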

Original language: English (US)
Title of host publication: Privacy in Statistical Databases - UNESCO Chair in Data Privacy, International Conference, PSD 2012, Proceedings
Publisher: Springer Verlag
Number of pages: 15
ISBN (Print): 9783642336263
State: Published - 2012
Event: International Conference on Privacy in Statistical Databases, PSD 2012 - Palermo, Italy
Duration: Sep 26 2012 to Sep 28 2012

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 7556 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Other: International Conference on Privacy in Statistical Databases, PSD 2012

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science

