RI: Small: Secure, Robust, and End-User Driven Prediction Aware Counterfactual Explanations

Project: Research project

Project Details

Description

Machine Learning (ML) has pervaded almost every aspect of current-day society. With ML touching the lives of non-expert users, a critical need for transparency and accountability in ML systems has arisen. For example, ML systems are used as part of hiring processes in which the people in charge do not know exactly how the best candidates for a position are selected. This project addresses this issue by focusing on counterfactual explanations (CFEs) for ML models, a technique for providing end-users with easy-to-understand and actionable explanations of deployed ML models. Unfortunately, current state-of-the-art CFE techniques still have major limitations that have impeded their widespread adoption in real-world contexts. This project addresses these limitations by developing secure, robust, and user-driven prediction-aware CFE techniques for ML models, which are essential for providing actionable recourse to marginalized populations that are negatively affected by algorithmic decisions. The project's broader significance lies in its potential to protect intellectual property, maintain trust by ensuring recourse recommendations remain valid despite model updates, and incorporate stakeholder feedback into the design of explanations. This aligns with NSF's mission to advance knowledge and education in science and engineering while promoting fair and transparent use of technology. The impact extends to domains including agriculture and healthcare, where improved transparency and accountability of ML systems can lead to better decision-making and improved outcomes for underrepresented groups.

Counterfactual explanations (CFEs) for Machine Learning (ML) models have received a lot of attention due to their ability to provide actionable recourse to marginalized populations in a wide variety of domains. While most CFE techniques are post-hoc (i.e., they generate explanations for pre-trained black-box ML models), a recent line of research proposes prediction-aware CFEs that make a novel departure from the prevalent post-hoc paradigm. Unfortunately, critical research gaps need to be tackled before the promise of prediction-aware CFEs can be truly realized: (i) these techniques can be exploited by adversaries to extract proprietary ML model details, which could lead to intellectual property theft; (ii) the explanations they generate often grow stale as an enterprise continuously updates its proprietary ML model, which prevents the enterprise from honoring these stale CFEs; and (iii) these techniques do not allow stakeholders to express feedback about the CFEs provided to them. To address these research gaps, we propose to develop a new suite of secure, robust, and end-user driven prediction-aware CFE techniques that increase the usability of prediction-aware CFE systems. Our algorithmic tools and approaches will leverage and build upon techniques from adversarial machine learning, bi-level optimization, deep learning, game theory, and security. Additionally, the project includes developing an interactive smartphone application to facilitate stakeholder interaction with CFEs, enhancing accessibility and usability.
The research plan involves rigorous evaluation using benchmark datasets and real-world user studies in collaboration with non-profits to measure the effectiveness of the proposed techniques in practical scenarios. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
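
As background on the core technique named above, the sketch below illustrates how a basic counterfactual explanation can be computed for a simple differentiable model: starting from an instance the model rejects, search for a nearby input the model accepts, and report the feature changes as the suggested recourse. This is a generic gradient-based formulation in the spirit of Wachter et al., not the secure, robust, prediction-aware methods this project proposes; the toy scikit-learn classifier, the proximity weight lam, and the optimizer settings are illustrative assumptions.

```python
# Illustrative sketch only: a generic gradient-based counterfactual explanation.
# NOT the project's prediction-aware method; the toy model and all settings
# below are assumptions made for demonstration.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Train a toy binary classifier standing in for, e.g., a hiring or loan model.
X, y = make_classification(n_samples=500, n_features=4, random_state=0)
clf = LogisticRegression().fit(X, y)
w, b = clf.coef_[0], clf.intercept_[0]

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def counterfactual(x0, lam=0.05, lr=0.05, steps=500):
    """Find x close to x0 that the model confidently assigns to class 1.

    Gradient descent on  -log sigmoid(w.x + b) + lam * ||x - x0||^2,
    i.e., push the prediction toward the desired class while staying near x0.
    """
    x = x0.copy()
    for _ in range(steps):
        p = sigmoid(w @ x + b)
        # Gradient of the prediction loss plus the L2 proximity penalty.
        grad = -(1.0 - p) * w + 2.0 * lam * (x - x0)
        x -= lr * grad
    return x

# Take an instance the model rejects and compute a counterfactual that flips it.
x0 = X[clf.predict(X) == 0][0]
x_cf = counterfactual(x0)
print("original prediction:      ", clf.predict([x0])[0])
print("counterfactual prediction:", clf.predict([x_cf])[0])
print("suggested feature changes:", np.round(x_cf - x0, 3))
```

Practical CFE systems add further constraints that this toy sketch omits, such as restricting changes to features the end-user can actually act on and keeping the counterfactual plausible with respect to the data distribution.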
Status: Active
Effective start/end date: 10/1/24 – 9/30/27

Funding

  • National Science Foundation: $592,294.00
