SaTC: CORE: Frontier: Collaborative: End-to-End Trustworthiness of Machine-Learning Systems

Project: Research project

Project Details


This frontier project establishes the Center for Trustworthy Machine Learning (CTML), a large-scale, multi-institution, multi-disciplinary effort whose goal is to develop scientific understanding of the risks inherent to machine learning, and to develop the tools, metrics, and methods to manage and mitigate them. The center is led by a cross-disciplinary team developing unified theory, algorithms and empirical methods within complex and ever-evolving ML approaches, application domains, and environments. The science and arsenal of defensive techniques emerging within the center will provide the basis for building future systems in a more trustworthy and secure manner, as well as fostering a long term community of research within this essential domain of technology. The center has a number of outreach efforts, including a massive open online course (MOOC) on this topic, an annual conference, and broad-based educational initiatives. The investigators continue their ongoing efforts at broadening participation in computing via a joint summer school on trustworthy ML aimed at underrepresented groups, and by engaging in activities for high school students across the country via a sequence of webinars advertised through the She++ network and other organizations.

The center focuses on three interconnected and parallel investigative directions that represent the different classes of attacks attacking ML systems: inference attacks, training attacks, and abuses of ML. The first direction explores inference time security, namely methods to defend a trained model from adversarial inputs. This effort emphasizes developing formally grounded measurements of robustness against adversarial examples (defenses), as well as understanding the limits and costs of attacks. The second research direction aims to develop rigorously grounded measures of robustness to attacks that corrupt the training data and new training methods that are robust to adversarial manipulation. The final direction tackles the general security implications of sophisticated ML algorithms including the potential abuses of generative ML models, such as models that generate (fake) content, as well as data mechanisms to prevent the theft of a machine learning model by an adversary who interacts with the model.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Effective start/end date10/1/189/30/23


  • National Science Foundation: $5,019,520.00


Explore the research topics touched on by this project. These labels are generated based on the underlying awards/grants. Together they form a unique fingerprint.