TY - GEN
T1 - Hierarchical clustering for discrimination discovery
T2 - 2nd IEEE International Conference on Artificial Intelligence and Knowledge Engineering, AIKE 2019
AU - Nasiriani, Neda
AU - Squicciarini, Anna
AU - Saldanha, Zara
AU - Goel, Sanchit
AU - Zannone, Nicola
PY - 2019/6/1
Y1 - 2019/6/1
N2 - Today, data is an essential part of many decision-making processes in businesses and social life through the use of various machine learning techniques. These methods can easily perpetuate human bias in the data and result in discrimination. Despite a growing interest in data discrimination discovery and removal, to date there is a lack of a general and robust framework to distinguish discriminatory decision-making processes from non-discriminatory ones. In this work, we present a generic framework that helps detect possible discrimination by analyzing historical data and associated decisions using a top-down unsupervised approach, which we refer to as hierarchical clustering. Our approach is highly adaptive as it gradually 'learns' users' inherent groups, and clusters their records using cohesiveness and density of points in the dataset. Moreover, we propose a progressive attribute-selection method to choose statistically relevant attributes, thus reducing the effect of noise. Finally, we adopt a recursive notion of cluster profile that is homogeneous w.r.t. decision labels. This allows for deeper insights on the data and on the decision-making underlying the final user classification. Our framework is able to identify both positive and negative bias resulting in discrimination. We also highlight patterns of discrimination revealed by the homogeneous cluster centroids, which otherwise could not be captured.
AB - Today, data is an essential part of many decision-making processes in businesses and social life through the use of various machine learning techniques. These methods can easily perpetuate human bias in the data and result in discrimination. Despite a growing interest in data discrimination discovery and removal, to date there is a lack of a general and robust framework to distinguish discriminatory decision-making processes from non-discriminatory ones. In this work, we present a generic framework that helps detect possible discrimination by analyzing historical data and associated decisions using a top-down unsupervised approach, which we refer to as hierarchical clustering. Our approach is highly adaptive as it gradually 'learns' users' inherent groups, and clusters their records using cohesiveness and density of points in the dataset. Moreover, we propose a progressive attribute-selection method to choose statistically relevant attributes, thus reducing the effect of noise. Finally, we adopt a recursive notion of cluster profile that is homogeneous w.r.t. decision labels. This allows for deeper insights on the data and on the decision-making underlying the final user classification. Our framework is able to identify both positive and negative bias resulting in discrimination. We also highlight patterns of discrimination revealed by the homogeneous cluster centroids, which otherwise could not be captured.
UR - http://www.scopus.com/inward/record.url?scp=85071510303&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85071510303&partnerID=8YFLogxK
U2 - 10.1109/AIKE.2019.00041
DO - 10.1109/AIKE.2019.00041
M3 - Conference contribution
T3 - Proceedings - IEEE 2nd International Conference on Artificial Intelligence and Knowledge Engineering, AIKE 2019
SP - 187
EP - 194
BT - Proceedings - IEEE 2nd International Conference on Artificial Intelligence and Knowledge Engineering, AIKE 2019
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 3 June 2019 through 5 June 2019
ER -