TY - JOUR
T1 - Comparing Different Membership Inference Attacks With a Comprehensive Benchmark
AU - Niu, Jun
AU - Zhu, Xiaoyan
AU - Zeng, Moxuan
AU - Zhang, Ge
AU - Zhao, Qingyang
AU - Huang, Chunhui
AU - Zhang, Yangming
AU - An, Suyu
AU - Wang, Yangzhong
AU - Yue, Xinghui
AU - He, Zhipeng
AU - Guo, Weihao
AU - Shen, Kuo
AU - Liu, Peng
AU - Zhang, Lan
AU - Ma, Jianfeng
AU - Zhang, Yuqing
N1 - Publisher Copyright:
© 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - Membership inference (MI) attacks pose a significant threat to user privacy in machine learning systems. While numerous attack mechanisms have been proposed in the literature, the lack of standardized evaluation parameters and metrics has led to inconsistent and even conflicting comparison results. To address this issue and facilitate a systematic analysis of these disparate findings, we introduce MIBench, a comprehensive benchmark that includes a suite of carefully designed evaluation scenarios (ESs) and evaluation metrics to provide a consistent framework for assessing the efficacy of various MI techniques. The ESs are crafted to encompass four critical factors: intra-dataset distance distribution, inter-sample distance within the target dataset, differential distance analysis, and inference withholding ratio. In total, MIBench includes ten typical evaluation metrics and incorporates 84 distinct ESs for each dataset. Using MIBench, we conducted a thorough comparative analysis of 15 state-of-the-art MI attacks across 588 ESs, seven widely adopted datasets, and seven representative model architectures. Our analysis revealed 83 instances of Conflicting Comparison Results (CCR), providing substantial evidence for the CCR Phenomenon. We identified two CCR types: Type 1 (single-factor) and Type 2 (dual-factor). The distribution of CCR instances across the four critical factors was: inter-sample distance (40.96%), differential distance (37.35%), inference withholding ratio (19.28%), and intra-dataset distance (2.41%).
UR - https://www.scopus.com/pages/publications/86000738783
UR - https://www.scopus.com/inward/citedby.url?scp=86000738783&partnerID=8YFLogxK
U2 - 10.1109/TIFS.2025.3550070
DO - 10.1109/TIFS.2025.3550070
M3 - Article
AN - SCOPUS:86000738783
SN - 1556-6013
VL - 20
SP - 6592
EP - 6606
JO - IEEE Transactions on Information Forensics and Security
JF - IEEE Transactions on Information Forensics and Security
ER -