Article

Leveraging transferability and improved beam search in textual adversarial attacks

Journal

NEUROCOMPUTING
Volume 500, Pages 135-142

Publisher

ELSEVIER
DOI: 10.1016/j.neucom.2022.05.054

Keywords

Deep neural networks; Natural language processing; Adversarial examples; Textual attacks

Funding

  1. National Key R&D Program of China [2019YFB1706003]
  2. Major Key Project of PCL [PCL2022A03]
  3. Key Program of Zhejiang Provincial Natural Science Foundation of China [LZ22F020007]
  4. Natural Science Foundation of China [61902082]
  5. Guangdong Province Key R&D Program of China [2019B010136003]
  6. Guangdong Province Universities and Colleges Pearl River Scholar Funded Scheme (2019)


Adversarial attacks in NLP are difficult to defend against because human language is discrete and abstract. Previous studies have used various word replacement strategies to generate semantics-preserving adversarial texts, but these query-based methods explore the search space only to a limited extent. This study proposes an improved beam search algorithm and uses the vulnerability that transfers between models to select vulnerable candidate words. Experimental results show that the method outperforms three advanced attack methods under black-box settings.
Adversarial attacks in NLP are difficult to ward off because of the discrete and highly abstract nature of human languages. Prior works use different word replacement strategies to generate semantics-preserving adversarial texts. These query-based methods, however, explore the search space only to a limited extent. To explore the search space more fully, we use an improved beam search with multiple random perturbing positions. In addition, we use the transferable vulnerability from surrogate models to choose vulnerable candidate words for target models. We empirically show that beam search with multiple random attacking positions works better than the commonly used greedy search with word importance ranking. Extensive experiments on three popular datasets demonstrate that our method outperforms three advanced attacking methods under black-box settings. Ablation studies show the effectiveness of the improved beam search, which achieves a higher success rate than the greedy approach under the same query budget. (c) 2022 Elsevier B.V. All rights reserved.
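
The search strategy described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the lexicon, the synonym table, the two scoring functions, and every parameter name (`beam_width`, `positions_per_step`, `top_k`, `max_queries`) are hypothetical stand-ins chosen for illustration. It only shows the general shape of a beam search that perturbs several randomly chosen positions per step, uses a freely queryable surrogate to rank substitute words, and spends target-model queries only on the top-ranked ones.

```python
import random

# Hypothetical toy lexicon and synonym table (assumptions, not from the paper).
POSITIVE = {"good": 1.0, "great": 1.5, "fine": 0.4,
            "nice": 0.6, "enjoyable": 0.7, "decent": 0.3}
SYNONYMS = {
    "good": ["fine", "decent"],
    "great": ["nice", "fine"],
    "enjoyable": ["watchable"],
}

def target_score(tokens):
    """Toy black-box target: confidence in the 'positive' label, in [0, 1)."""
    s = sum(POSITIVE.get(t, 0.0) for t in tokens)
    return s / (s + 1.0)

def surrogate_score(tokens):
    """Toy surrogate: a similar but not identical scorer that is free to query."""
    s = sum(0.9 * POSITIVE.get(t, 0.0) for t in tokens)
    return s / (s + 1.0)

def beam_attack(tokens, beam_width=2, positions_per_step=2, top_k=1,
                max_queries=50, seed=0):
    """Return (adversarial_tokens or None, number of target queries used)."""
    rng = random.Random(seed)
    queries = 1  # one query to score the original input
    beam = [(target_score(tokens), list(tokens), frozenset())]
    while beam and queries < max_queries:
        candidates = []
        for _, sent, used in beam:
            # Positions that have a synonym and were not perturbed yet.
            free = [i for i, t in enumerate(sent)
                    if i not in used and t in SYNONYMS]
            # Perturb several random positions instead of one ranked position.
            for pos in rng.sample(free, min(positions_per_step, len(free))):
                # Rank substitutes by surrogate score (lower is more damaging).
                subs = sorted(
                    SYNONYMS[sent[pos]],
                    key=lambda w: surrogate_score(sent[:pos] + [w] + sent[pos + 1:]),
                )
                for w in subs[:top_k]:
                    if queries >= max_queries:
                        break
                    new = sent[:pos] + [w] + sent[pos + 1:]
                    score = target_score(new)
                    queries += 1
                    if score < 0.5:  # target no longer predicts 'positive'
                        return new, queries
                    candidates.append((score, new, used | {pos}))
        # Keep the beam_width candidates with the lowest target confidence.
        candidates.sort(key=lambda c: c[0])
        beam = candidates[:beam_width]
    return None, queries

adv, used_queries = beam_attack("a good and great and enjoyable film".split())
print(adv, used_queries)
```

In this toy setting all three sentiment words must be replaced before the target's confidence drops below 0.5, so the search needs three steps. A greedy variant would correspond to `beam_width=1` with positions visited in a fixed importance order rather than sampled at random.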
