TY - GEN
T1 - LeapAttack
T2 - 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2022
AU - Ye, Muchao
AU - Chen, Jinghui
AU - Miao, Chenglin
AU - Wang, Ting
AU - Ma, Fenglong
N1 - Publisher Copyright:
© 2022 ACM.
PY - 2022/8/14
Y1 - 2022/8/14
N2 - Generating text adversarial examples in the hard-label setting is a more realistic and challenging black-box adversarial attack problem, whose challenge comes from the fact that gradient cannot be directly calculated from discrete word replacements. Consequently, the effectiveness of gradient-based methods for this problem still awaits improvement. In this paper, we propose a gradient-based optimization method named LeapAttack to craft high-quality text adversarial examples in the hard-label setting. To specify, LeapAttack employs the word embedding space to characterize the semantic deviation between the two words of each perturbed substitution by their difference vector. Facilitated by this expression, LeapAttack gradually updates the perturbation direction and constructs adversarial examples in an iterative round trip: firstly, the gradient is estimated by transforming randomly sampled word candidates to continuous difference vectors after moving the current adversarial example near the decision boundary; secondly, the estimated gradient is mapped back to a new substitution word based on the cosine similarity metric. Extensive experimental results show that in the general case LeapAttack can efficiently generate high-quality text adversarial examples with the highest semantic similarity and the lowest perturbation rate in the hard-label setting.
AB - Generating text adversarial examples in the hard-label setting is a more realistic and challenging black-box adversarial attack problem, whose challenge comes from the fact that gradient cannot be directly calculated from discrete word replacements. Consequently, the effectiveness of gradient-based methods for this problem still awaits improvement. In this paper, we propose a gradient-based optimization method named LeapAttack to craft high-quality text adversarial examples in the hard-label setting. To specify, LeapAttack employs the word embedding space to characterize the semantic deviation between the two words of each perturbed substitution by their difference vector. Facilitated by this expression, LeapAttack gradually updates the perturbation direction and constructs adversarial examples in an iterative round trip: firstly, the gradient is estimated by transforming randomly sampled word candidates to continuous difference vectors after moving the current adversarial example near the decision boundary; secondly, the estimated gradient is mapped back to a new substitution word based on the cosine similarity metric. Extensive experimental results show that in the general case LeapAttack can efficiently generate high-quality text adversarial examples with the highest semantic similarity and the lowest perturbation rate in the hard-label setting.
UR - http://www.scopus.com/inward/record.url?scp=85137148003&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85137148003&partnerID=8YFLogxK
U2 - 10.1145/3534678.3539357
DO - 10.1145/3534678.3539357
M3 - Conference contribution
AN - SCOPUS:85137148003
T3 - Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
SP - 2307
EP - 2315
BT - KDD 2022 - Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
PB - Association for Computing Machinery
Y2 - 14 August 2022 through 18 August 2022
ER -