TY - GEN
T1 - Causal Graph Fuzzing for Fair ML Software Development
AU - Monjezi, Verya
AU - Kumar, Ashish
AU - Tan, Gang
AU - Trivedi, Ashutosh
AU - Tizpaz-Niari, Saeid
N1 - Publisher Copyright:
© 2024 IEEE Computer Society. All rights reserved.
PY - 2024/4/14
Y1 - 2024/4/14
N2 - Machine learning (ML) is increasingly used in high-stakes areas such as autonomous driving, finance, and criminal justice. However, it often unintentionally perpetuates biases against marginalized groups. To address this, the software engineering community has developed fairness testing and debugging methods, establishing best practices for fair ML software. These practices focus on the design of model training, including the selection of sensitive and non-sensitive attributes and the configuration of hyperparameters. However, applying these practices across different socio-economic and cultural contexts is challenging, as societal constraints vary. Our study proposes a search-based software engineering approach to evaluate the robustness of these fairness practices. We formulate the practices as first-order logic properties and search for two neighborhood datasets such that a practice holds in one dataset but fails in the other. Our key observation is that these practices should be general and robust to various uncertainties such as noise, faulty labeling, and demographic shifts. To generate datasets, we turn to the causal graph representations of datasets and apply perturbations over the causal graphs to generate neighborhood datasets. In this short paper, we demonstrate our methodology with an example of predicting risk in a car insurance application.
AB - Machine learning (ML) is increasingly used in high-stakes areas such as autonomous driving, finance, and criminal justice. However, it often unintentionally perpetuates biases against marginalized groups. To address this, the software engineering community has developed fairness testing and debugging methods, establishing best practices for fair ML software. These practices focus on the design of model training, including the selection of sensitive and non-sensitive attributes and the configuration of hyperparameters. However, applying these practices across different socio-economic and cultural contexts is challenging, as societal constraints vary. Our study proposes a search-based software engineering approach to evaluate the robustness of these fairness practices. We formulate the practices as first-order logic properties and search for two neighborhood datasets such that a practice holds in one dataset but fails in the other. Our key observation is that these practices should be general and robust to various uncertainties such as noise, faulty labeling, and demographic shifts. To generate datasets, we turn to the causal graph representations of datasets and apply perturbations over the causal graphs to generate neighborhood datasets. In this short paper, we demonstrate our methodology with an example of predicting risk in a car insurance application.
UR - http://www.scopus.com/inward/record.url?scp=85194835612&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85194835612&partnerID=8YFLogxK
U2 - 10.1145/3639478.3643530
DO - 10.1145/3639478.3643530
M3 - Conference contribution
AN - SCOPUS:85194835612
T3 - Proceedings - International Conference on Software Engineering
SP - 402
EP - 403
BT - Proceedings - 2024 ACM/IEEE 46th International Conference on Software Engineering
PB - IEEE Computer Society
T2 - 46th International Conference on Software Engineering: Companion, ICSE-Companion 2024
Y2 - 14 April 2024 through 20 April 2024
ER -