Journal
NEUROCOMPUTING
Volume 500, Issue -, Pages 135-142
Publisher
ELSEVIER
DOI: 10.1016/j.neucom.2022.05.054
Keywords
Deep neural networks; Natural language processing; Adversarial examples; Textual attacks
Funding
- National Key R&D Program of China [2019YFB1706003]
- Major Key Project of PCL [PCL2022A03]
- Key Program of Zhejiang Provincial Natural Science Foundation of China [LZ22F020007]
- Natural Science Foundation of China [61902082]
- Guangdong Province Key R&D Program of China [2019B010136003]
- Guangdong Province Universities and Colleges Pearl River Scholar Funded Scheme (2019)
Adversarial attacks in NLP are difficult to defend against because of the discrete and abstract nature of human languages. Previous studies have used different word replacement strategies to generate semantics-preserving adversarial texts. However, these query-based methods explore only a limited part of the search space. This study proposes an improved beam search algorithm and uses the vulnerability that transfers between models to select vulnerable candidate words. Experimental results show that the method outperforms three advanced attack methods under black-box settings.
Adversarial attacks in NLP are difficult to ward off because of the discrete and highly abstract nature of human languages. Prior works use different word replacement strategies to generate semantics-preserving adversarial texts. These query-based methods, however, explore only a limited part of the search space. To explore the search space more fully, we use an improved beam search with multiple random perturbing positions. In addition, we use the transferable vulnerability from surrogate models to choose vulnerable candidate words for target models. We empirically show that beam search with multiple random attacking positions works better than the commonly used greedy search with word importance ranking. Extensive experiments on three popular datasets demonstrate that our method can outperform three advanced attacking methods under black-box settings. We provide ablation studies showing the effectiveness of our improved beam search, which achieves a higher success rate than the greedy approach under the same query budget.
(c) 2022 Elsevier B.V. All rights reserved.
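The search strategy described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `beam_search_attack`, its parameters, and the position-indexed `candidates` mapping are all assumptions made for illustration, and the surrogate-model step that selects vulnerable candidate words is omitted (candidates are simply passed in). The target model is treated as a black box that returns the probability of the true label, so a lower score means a more successful attack.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple


def beam_search_attack(
    words: Sequence[str],
    candidates: Dict[int, List[str]],
    score_fn: Callable[[Sequence[str]], float],
    beam_width: int = 3,
    positions_per_step: int = 2,
    max_steps: int = 5,
    threshold: float = 0.5,
    seed: int = 0,
) -> Tuple[List[str], float, int]:
    """Toy word-substitution attack using beam search over random positions.

    candidates maps a word position to its allowed substitutes (hypothetical
    input; the paper derives candidates with surrogate models).
    score_fn is the black-box target model: probability of the true label.
    Returns (best_text, best_score, number_of_model_queries).
    """
    rng = random.Random(seed)
    start = list(words)
    beam = [(score_fn(start), start)]
    queries = 1
    positions = sorted(candidates)

    for _ in range(max_steps):
        # Stop once the best text in the beam flips the model's decision.
        if beam[0][0] < threshold or not positions:
            break
        # Keep current beam members so a step can never make the beam worse.
        pool = {tuple(text): score for score, text in beam}
        for _, text in beam:
            # Perturb several randomly chosen positions instead of following
            # a fixed word-importance ranking, as a greedy search would.
            k = min(positions_per_step, len(positions))
            for i in rng.sample(positions, k):
                for sub in candidates[i]:
                    new_text = text[:i] + [sub] + text[i + 1:]
                    key = tuple(new_text)
                    if key in pool:
                        continue  # already scored; do not spend a query
                    pool[key] = score_fn(new_text)
                    queries += 1
        # Keep the beam_width texts with the lowest true-label probability.
        ranked = sorted(pool.items(), key=lambda item: item[1])
        beam = [(score, list(text)) for text, score in ranked[:beam_width]]

    best_score, best_text = beam[0]
    return best_text, best_score, queries
```

Setting `beam_width=1` and `positions_per_step=1` reduces the sketch to a greedy search over one random position per step, which makes it easy to compare the two strategies under the same query count.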