TY - GEN
T1 - Pre-trained Encoders in Self-Supervised Learning Improve Secure and Privacy-preserving Supervised Learning
AU - Liu, Hongbin
AU - Qu, Wenjie
AU - Jia, Jinyuan
AU - Gong, Neil Zhenqiang
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Classifiers in supervised learning have various security and privacy issues, e.g., 1) data poisoning attacks, backdoor attacks, and adversarial examples on the security side, as well as 2) inference attacks against the training data on the privacy side. Various secure and privacy-preserving supervised learning algorithms with formal guarantees have been proposed to address these issues. However, they suffer from various limitations such as accuracy loss, small certified security guarantees, and/or inefficiency. Self-supervised learning pre-trains encoders using unlabeled data. Given a pre-trained encoder as a feature extractor, supervised learning can train a simple yet accurate classifier using a small amount of labeled training data. In this work, we perform the first systematic, principled measurement study to understand whether and when a pre-trained encoder can address the limitations of secure or privacy-preserving supervised learning algorithms. Our key findings are that a pre-trained encoder substantially improves 1) both accuracy under no attacks and certified security guarantees against data poisoning and backdoor attacks of state-of-the-art secure learning algorithms (i.e., bagging and KNN), 2) certified security guarantees of randomized smoothing against adversarial examples without sacrificing its accuracy under no attacks, and 3) accuracy of differentially private classifiers.
AB - Classifiers in supervised learning have various security and privacy issues, e.g., 1) data poisoning attacks, backdoor attacks, and adversarial examples on the security side, as well as 2) inference attacks against the training data on the privacy side. Various secure and privacy-preserving supervised learning algorithms with formal guarantees have been proposed to address these issues. However, they suffer from various limitations such as accuracy loss, small certified security guarantees, and/or inefficiency. Self-supervised learning pre-trains encoders using unlabeled data. Given a pre-trained encoder as a feature extractor, supervised learning can train a simple yet accurate classifier using a small amount of labeled training data. In this work, we perform the first systematic, principled measurement study to understand whether and when a pre-trained encoder can address the limitations of secure or privacy-preserving supervised learning algorithms. Our key findings are that a pre-trained encoder substantially improves 1) both accuracy under no attacks and certified security guarantees against data poisoning and backdoor attacks of state-of-the-art secure learning algorithms (i.e., bagging and KNN), 2) certified security guarantees of randomized smoothing against adversarial examples without sacrificing its accuracy under no attacks, and 3) accuracy of differentially private classifiers.
UR - https://www.scopus.com/pages/publications/85199187780
UR - https://www.scopus.com/pages/publications/85199187780#tab=citedBy
U2 - 10.1109/SPW63631.2024.00019
DO - 10.1109/SPW63631.2024.00019
M3 - Conference contribution
AN - SCOPUS:85199187780
T3 - Proceedings - 45th IEEE Symposium on Security and Privacy Workshops, SPW 2024
SP - 144
EP - 156
BT - Proceedings - 45th IEEE Symposium on Security and Privacy Workshops, SPW 2024
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 45th IEEE Symposium on Security and Privacy Workshops, SPW 2024
Y2 - 23 May 2024
ER -