TY - GEN
T1 - How Do People Rank Multiple Mutant Agents?
AU - Dodge, Jonathan
AU - Anderson, Andrew A.
AU - Olson, Matthew
AU - Dikkala, Rupika
AU - Burnett, Margaret
N1 - Publisher Copyright:
© 2022 ACM.
PY - 2022/3/22
Y1 - 2022/3/22
N2 - Faced with several AI-powered sequential decision-making systems, how might someone choose on which to rely? For example, imagine car buyer Blair shopping for a self-driving car, or developer Dillon trying to choose an appropriate ML model to use in their application. Their first choice might be infeasible (i.e., too expensive in money or execution time), so they may need to select their second or third choice. To address this question, this paper presents: 1) Explanation Resolution, a quantifiable direct measurement concept; 2) a new XAI empirical task to measure explanations: "the Ranking Task"; and 3) a new strategy for inducing controllable agent variations - Mutant Agent Generation. In support of those main contributions, it also presents 4) novel explanations for sequential decision-making agents; 5) an adaptation to the AAR/AI assessment process; and 6) a qualitative study around these devices with 10 participants to investigate how they performed the Ranking Task on our mutant agents, using our explanations, and structured by AAR/AI. From an XAI researcher's perspective, just as mutation testing can be applied to any code, mutant agent generation can be applied to essentially any neural network for which one wants to evaluate an assessment process or explanation type. From an XAI user's perspective, the participants ranked the agents well overall, but showed the importance of high explanation resolution for close differences between agents. The participants also revealed the importance of supporting a wide diversity of explanation diets and agent "test selection" strategies.
AB - Faced with several AI-powered sequential decision-making systems, how might someone choose on which to rely? For example, imagine car buyer Blair shopping for a self-driving car, or developer Dillon trying to choose an appropriate ML model to use in their application. Their first choice might be infeasible (i.e., too expensive in money or execution time), so they may need to select their second or third choice. To address this question, this paper presents: 1) Explanation Resolution, a quantifiable direct measurement concept; 2) a new XAI empirical task to measure explanations: "the Ranking Task"; and 3) a new strategy for inducing controllable agent variations - Mutant Agent Generation. In support of those main contributions, it also presents 4) novel explanations for sequential decision-making agents; 5) an adaptation to the AAR/AI assessment process; and 6) a qualitative study around these devices with 10 participants to investigate how they performed the Ranking Task on our mutant agents, using our explanations, and structured by AAR/AI. From an XAI researcher's perspective, just as mutation testing can be applied to any code, mutant agent generation can be applied to essentially any neural network for which one wants to evaluate an assessment process or explanation type. From an XAI user's perspective, the participants ranked the agents well overall, but showed the importance of high explanation resolution for close differences between agents. The participants also revealed the importance of supporting a wide diversity of explanation diets and agent "test selection" strategies.
UR - http://www.scopus.com/inward/record.url?scp=85127754207&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85127754207&partnerID=8YFLogxK
U2 - 10.1145/3490099.3511115
DO - 10.1145/3490099.3511115
M3 - Conference contribution
AN - SCOPUS:85127754207
T3 - International Conference on Intelligent User Interfaces, Proceedings IUI
SP - 191
EP - 211
BT - 27th International Conference on Intelligent User Interfaces, IUI 2022
PB - Association for Computing Machinery
T2 - 27th International Conference on Intelligent User Interfaces, IUI 2022
Y2 - 22 March 2022 through 25 March 2022
ER -