3260 papers • 126 benchmarks • 313 datasets
Adversarial Text refers to a specialised text sequence that is designed specifically to influence the prediction of a language model. Adversarial text attacks are commonly carried out against Large Language Models (LLMs). Research on understanding different adversarial approaches can help us build effective defense mechanisms that detect malicious text input and build robust language models.
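To make the idea concrete, a word-substitution attack replaces individual words with synonyms so that the text still reads the same to a human while the model's prediction flips. The sketch below is a minimal, self-contained illustration against a hypothetical lexicon-based sentiment classifier; the classifier, synonym table, and example sentence are invented for demonstration and do not correspond to any published attack.

```python
# Minimal illustrative sketch of a word-substitution adversarial attack.
# The toy lexicon-based classifier and the synonym table are hypothetical
# stand-ins for a real model and a real synonym source.

SYNONYMS = {
    "terrible": ["dire", "dreadful"],
    "boring": ["tedious", "dull"],
    "awful": ["appalling", "atrocious"],
}

NEGATIVE_WORDS = {"terrible", "boring", "awful", "bad", "worst"}


def toy_classifier(text: str) -> str:
    """Predict 'negative' if any known negative word appears, else 'positive'."""
    words = text.lower().split()
    return "negative" if any(w in NEGATIVE_WORDS for w in words) else "positive"


def word_substitution_attack(text: str) -> str:
    """Replace words with synonyms one position at a time, keeping each swap,
    and stop as soon as the classifier's prediction flips."""
    original_label = toy_classifier(text)
    words = text.split()
    for i, word in enumerate(words):
        candidates = SYNONYMS.get(word.lower(), [])
        if not candidates:
            continue
        words[i] = candidates[0]  # take the first synonym for simplicity
        if toy_classifier(" ".join(words)) != original_label:
            break  # prediction flipped with small, meaning-preserving edits
    return " ".join(words)


if __name__ == "__main__":
    sentence = "the plot was terrible and the acting boring"
    adversarial = word_substitution_attack(sentence)
    print(toy_classifier(sentence))     # negative
    print(adversarial)                  # the plot was dire and the acting tedious
    print(toy_classifier(adversarial))  # positive
```

Real attacks follow the same loop but use a neural victim model, constrain the perturbation (semantic similarity, grammaticality, edit distance), and search over candidates far more carefully.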
(Image credit: Papersgraph)
These leaderboards are used to track progress in Adversarial Text.
No benchmarks available.
Use these libraries to find Adversarial Text models and implementations.
No subtasks available.
A novel deep architecture and GAN formulation are developed to effectively bridge advances in text and image modeling, translating visual concepts from characters to pixels.
TextFooler is presented, a simple but strong baseline for generating adversarial text that outperforms previous attacks in success rate and perturbation rate, is utility-preserving and efficient, and generates adversarial text with computational complexity linear in the text length.
T3-generated adversarial texts can successfully manipulate NLP models into outputting the targeted incorrect answer without misleading humans, and have high transferability, which enables black-box attacks in practice.
A decision-based attack strategy is proposed that crafts high-quality adversarial examples on text classification and entailment tasks, leveraging a population-based optimization algorithm to craft plausible and semantically similar adversarial examples by observing only the top label predicted by the target model.
A novel algorithm, DeepWordBug, is presented to effectively generate small text perturbations in a black-box setting that force a deep-learning classifier to misclassify a text input.
This work presents BAE, a powerful black-box attack for generating grammatically correct and semantically coherent adversarial examples, and shows that BAE performs a stronger attack on three widely used models across seven text classification datasets.
TextAttack, a Python framework for adversarial attacks, data augmentation, and adversarial training in NLP, is introduced, democratizing NLP: anyone can try data augmentation and adversarial training on any model or dataset with just a few lines of code (see the usage sketch below).
This work takes on the challenging task of learning to synthesise speech from normalised text or phonemes in an end-to-end manner, resulting in models which operate directly on character or phoneme input sequences and produce raw speech audio outputs.
This work studies the behavior of several black-box search algorithms used for generating adversarial examples for natural language processing (NLP) tasks and performs a fine-grained analysis of three elements relevant to search: search algorithm, search space, and search budget.
A Bigram and Unigram-based adaptive Semantic Preservation Optimization (BU-SPO) method is devised, which attacks text documents not only at the unigram word level but also at the bigram level to avoid generating meaningless sentences.
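As a usage sketch for TextAttack (referenced above), the snippet below follows the attack-recipe pattern from the library's documentation: wrap a victim model, build a recipe such as TextFooler, and run it over a dataset. The specific HuggingFace checkpoint and dataset names are illustrative choices, not prescribed by the paper.

```python
# Sketch of running a TextFooler attack with TextAttack (pattern follows the
# library's documented recipe API; model and dataset names are illustrative).
import transformers
import textattack

# Load an assumed example victim checkpoint from the HuggingFace hub.
model_name = "textattack/bert-base-uncased-imdb"
model = transformers.AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)
model_wrapper = textattack.models.wrappers.HuggingFaceModelWrapper(model, tokenizer)

# Build the TextFooler attack recipe and pick a dataset to attack.
attack = textattack.attack_recipes.TextFoolerJin2019.build(model_wrapper)
dataset = textattack.datasets.HuggingFaceDataset("imdb", split="test")

# Attack a handful of examples and print per-example results.
attack_args = textattack.AttackArgs(num_examples=10)
attacker = textattack.Attacker(attack, dataset, attack_args)
attacker.attack_dataset()
```

Other recipes (e.g. DeepWordBug or BAE) can be swapped in by building a different class from `textattack.attack_recipes` against the same model wrapper.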
Adding a benchmark result helps the community track progress.