TY - GEN
T1 - Evaluating and enhancing the robustness of dialogue systems
T2 - 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2019
AU - Cheng, Minhao
AU - Wei, Wei
AU - Hsieh, Cho-Jui
N1 - Publisher Copyright:
© 2019 Association for Computational Linguistics
PY - 2019
Y1 - 2019
N2 - Recent research has demonstrated that goal-oriented dialogue agents trained on large datasets can achieve striking performance when interacting with human users. In real-world applications, however, it is important to ensure that the agent performs smoothly not only with regular users but also with malicious ones who attack the system through interactions to achieve goals to their own advantage. In this paper, we develop algorithms to evaluate the robustness of a dialogue agent via carefully designed attacks using adversarial agents. These attacks are performed in both black-box and white-box settings. Furthermore, we demonstrate that adversarial training using our attacks can significantly improve the robustness of a goal-oriented dialogue system. In a case study of the negotiation agent developed by Lewis et al. (2017), our attacks reduced the trained RL-based agent's average reward advantage over the attacker from 2.68 to −5.76 on a scale from −10 to 10 for randomized goals. Moreover, with the proposed adversarial training, we improve the robustness of negotiation agents by 1.5 points on average against all our attacks.
UR - http://www.scopus.com/inward/record.url?scp=85084306892&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85084306892&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85084306892
T3 - NAACL HLT 2019 - 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Conference
SP - 3325
EP - 3335
BT - Long and Short Papers
PB - Association for Computational Linguistics (ACL)
Y2 - 2 June 2019 through 7 June 2019
ER -