SaTC: CORE: Small: Utilizing Untrusted Machine Learning Services Safely via Privacy-Preserving User Data Embedding

Project: Research project

Project Details

Description

Many Artificial Intelligence (AI)-based services perform valuable tasks based on images contributed by users. For example, AI artists can create an artistic image of a person when given the person's photograph as input. AI-based medical services can look at a patient's X-ray imagery and diagnose a disease. Smart glasses equipped with AI can detect obstacles or hazards to aid the visually impaired. However, these AI-based services may pose risks when the input contains privacy-sensitive information. For example, an AI-based service might learn a user's location by looking at the background of a photo or infer personal characteristics from their X-ray images. This project will develop methods that reduce such privacy risks while still allowing users to enjoy high-quality AI-based services. A key goal is to create methods that include mathematical guarantees, so that the protection is not only safe against existing attacks but also robust to potential future attacks. This, in turn, will pave the way for justified trust in image-based systems by both end users and service providers, even in privacy-critical domains like health, law, and national defense.

This project will study how to create an encoding, or embedding (a noisy latent representation of the input), from which a useful downstream task can still be performed but specified private information about the original input cannot be extracted. The core idea is to use Metric Differential Privacy (Metric DP) in pixel space to provide a strong privacy guarantee while designing the end-to-end system to maintain high downstream utility. Focusing on image-based applications, the project will address three main research challenges: (1) designing encodings with Metric DP -- simultaneously designing methods to generate privacy-preserving encodings and providing tight privacy guarantees; (2) improving utility -- developing methods to improve model quality when using privacy-preserving encodings; and (3) designing the overarching system -- improving the end-to-end user experience through systematic optimizations such as caching and post-processing of the model output. These research directions will be evaluated using a new dataset developed as part of this project, which will be open to the research community and will expedite future research on this topic. The project's outcomes will directly benefit closely related fields, such as instance encoding, split learning, and vertical federated learning.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
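As a concrete illustration of the Metric DP idea described above (a minimal sketch, not the project's actual mechanism), the Python snippet below applies coordinate-wise Laplace noise in pixel space. With noise scale 1/epsilon, the released encoding satisfies epsilon-Metric DP under the L1 (Manhattan) distance between images; the function name, epsilon value, and image shape are illustrative assumptions.

    import numpy as np

    def metric_dp_pixel_encoding(image, epsilon):
        """Illustrative sketch: release a noisy pixel-space encoding that
        satisfies Metric DP under the L1 distance between images, i.e.
        P[M(x) in S] <= exp(epsilon * ||x - x'||_1) * P[M(x') in S].
        """
        # Coordinate-wise Laplace noise with scale 1/epsilon yields the
        # epsilon * d-privacy guarantee above for the identity query.
        noise = np.random.laplace(loc=0.0, scale=1.0 / epsilon, size=image.shape)
        return image.astype(np.float64) + noise

    # Hypothetical usage: protect a normalized grayscale image before
    # sending it to an untrusted downstream model.
    x = np.random.rand(64, 64)   # stand-in for a user image with pixels in [0, 1]
    z = metric_dp_pixel_encoding(x, epsilon=0.5)

The project itself targets noisy latent representations and methods that preserve downstream utility under such noise; this sketch only demonstrates the form of the privacy guarantee.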
Status: Active
Effective start/end date: 10/1/24 – 9/30/27

Funding

  • National Science Foundation: $600,000.00
