TY - GEN
T1 - Cell bounds in two-way contingency tables based on conditional frequencies
AU - Smucker, Byran
AU - Slavković, Aleksandra B.
N1 - Copyright 2021 Elsevier B.V., All rights reserved.
PY - 2008
Y1 - 2008
N2 - Statistical methods for disclosure limitation (or control) have seen a coupling of tools from statistical methodology and operations research. For the summary and release of data in the form of a contingency table, some methods have focused on evaluating bounds on cell entries in k-way tables given sets of marginal totals, with less focus on evaluating disclosure risk given other summaries such as conditional probabilities, that is, tables of rates derived from the observed contingency tables. Narrow intervals, especially for cells with low counts, could pose a privacy risk. In this paper we derive closed-form solutions for the linear relaxation bounds on cell counts of a two-way contingency table given observed conditional probabilities. We also compute the corresponding sharp integer bounds via integer programming and show that there can be large differences in the width of these bounds, suggesting that using the linear relaxation is often an unacceptable shortcut to estimating the sharp bounds and the disclosure risk.
UR - http://www.scopus.com/inward/record.url?scp=56749156330&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=56749156330&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-87471-3_6
DO - 10.1007/978-3-540-87471-3_6
M3 - Conference contribution
AN - SCOPUS:56749156330
SN - 3540874704
SN - 9783540874706
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 64
EP - 76
BT - Privacy in Statistical Databases - UNESCO Chair in Data Privacy International Conference, PSD 2008, Proceedings
PB - Springer Verlag
T2 - International Conference on Privacy in Statistical Databases, PSD 2008
Y2 - 24 September 2008 through 26 September 2008
ER -