A novel Frequency-Injection based Backdoor Attack method (FIBA) that can deliver attacks across various medical image analysis (MIA) tasks, preserves the semantics of the poisoned image pixels, and can attack both classification and dense prediction models.
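As a concrete illustration of the frequency-injection idea, the NumPy sketch below blends the low-frequency amplitude spectrum of a trigger image into a clean image while keeping the clean image's phase, which is what preserves the pixel semantics. The blend ratio `alpha` and band fraction `beta` are illustrative assumptions, not the paper's exact hyperparameters.

```python
# Minimal sketch of frequency-injection poisoning in the spirit of FIBA.
# Assumption: grayscale H x W images; `alpha` and `beta` are placeholders.
import numpy as np

def poison_frequency(clean: np.ndarray, trigger: np.ndarray,
                     alpha: float = 0.15, beta: float = 0.1) -> np.ndarray:
    """Inject the trigger's low-frequency amplitude into `clean` (H x W)."""
    fft_c = np.fft.fftshift(np.fft.fft2(clean))
    fft_t = np.fft.fftshift(np.fft.fft2(trigger))
    amp_c, phase_c = np.abs(fft_c), np.angle(fft_c)
    amp_t = np.abs(fft_t)

    # Mask selecting the central (low-frequency) band of the shifted spectrum.
    h, w = clean.shape
    bh, bw = int(h * beta), int(w * beta)
    cy, cx = h // 2, w // 2
    mask = np.zeros((h, w), dtype=bool)
    mask[cy - bh:cy + bh, cx - bw:cx + bw] = True

    # Blend amplitudes inside the band only; the phase is left untouched,
    # which keeps the spatial structure (semantics) of the clean image.
    amp_c[mask] = (1 - alpha) * amp_c[mask] + alpha * amp_t[mask]
    fft_poisoned = amp_c * np.exp(1j * phase_c)
    return np.real(np.fft.ifft2(np.fft.ifftshift(fft_poisoned)))
```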
A simple and effective textual backdoor defense named ONION, which is based on outlier word detection and, to the best of the authors' knowledge, is the first method that can handle all textual backdoor attack situations.
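The outlier-word-detection idea can be sketched simply: score each word by how much removing it lowers the sentence's language-model perplexity, and drop the words whose removal helps the most, since inserted trigger words typically cause large perplexity drops. The snippet below is a minimal sketch using GPT-2 perplexity; the threshold `t` is an illustrative placeholder rather than the paper's tuned value.

```python
# Minimal sketch of ONION-style outlier word filtering with GPT-2 perplexity.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

@torch.no_grad()
def perplexity(text: str) -> float:
    ids = tokenizer(text, return_tensors="pt").input_ids
    loss = model(ids, labels=ids).loss  # mean per-token negative log-likelihood
    return float(torch.exp(loss))

def onion_filter(sentence: str, t: float = 10.0) -> str:
    words = sentence.split()
    if len(words) < 2:
        return sentence
    p0 = perplexity(sentence)
    kept = []
    for i, w in enumerate(words):
        reduced = " ".join(words[:i] + words[i + 1:])
        # Suspicion score: perplexity drop when word i is removed.
        # Rare inserted trigger tokens cause unusually large drops.
        if p0 - perplexity(reduced) <= t:
            kept.append(w)
    return " ".join(kept)
```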
Recently, machine learning models have been shown to be vulnerable to backdoor attacks, primarily due to the lack of transparency in black-box models such as deep neural networks. A third-party model can be poisoned so that it works adequately under normal conditions but behaves maliciously on samples carrying specific trigger patterns. However, in most existing backdoor attack methods the trigger injection function is manually defined, e.g., placing a small patch of pixels on an image or slightly deforming the image before poisoning the model. This results in a two-stage approach with a sub-optimal attack success rate and a lack of complete stealthiness under human inspection.

In this paper, we propose LIRA, a novel and stealthy backdoor attack framework that jointly learns the optimal, stealthy trigger injection function and poisons the model. We formulate this objective as a non-convex, constrained optimization problem: the trigger generator learns to manipulate the input with imperceptible noise so as to preserve model performance on clean data while maximizing the attack success rate on poisoned data. We then solve this challenging optimization problem with an efficient two-stage stochastic optimization procedure. The proposed attack framework achieves 100% attack success rates on several benchmark datasets, including MNIST, CIFAR-10, GTSRB, and T-ImageNet, while simultaneously bypassing existing backdoor defense methods and human inspection.
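The alternating scheme can be sketched in a few lines of PyTorch. The snippet below is a minimal illustration, assuming a classifier `f`, a trigger-generator network `T` producing an additive perturbation bounded by `eps` via `tanh`, and a single attacker-chosen target class; all names and hyperparameters are assumptions for illustration, not the paper's exact setup.

```python
# One joint training step: update the classifier, then the trigger generator.
import torch
import torch.nn.functional as F

def lira_step(f, T, opt_f, opt_T, x, y, target: int, eps: float = 0.05):
    # Stage 1: train the classifier on clean data and on poisoned data
    # relabeled to the attacker's target class (generator held fixed).
    x_poison = torch.clamp(x + eps * torch.tanh(T(x)), 0.0, 1.0)
    loss_f = (F.cross_entropy(f(x), y)
              + F.cross_entropy(f(x_poison.detach()),
                                torch.full_like(y, target)))
    opt_f.zero_grad(); loss_f.backward(); opt_f.step()

    # Stage 2: train the generator so poisoned inputs hit the target class;
    # the tanh/eps bound keeps the perturbation imperceptible.
    x_poison = torch.clamp(x + eps * torch.tanh(T(x)), 0.0, 1.0)
    loss_T = F.cross_entropy(f(x_poison), torch.full_like(y, target))
    opt_T.zero_grad(); loss_T.backward(); opt_T.step()
    return loss_f.item(), loss_T.item()
```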
This work proposes a novel backdoor defense that decouples the original end-to-end training process into three stages, and reveals that poisoned samples tend to cluster together in the feature space of the attacked DNN model, largely as a consequence of the end-to-end supervised training paradigm.
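A key ingredient of such a decoupled pipeline is separating likely-poisoned samples from clean ones before the final training stage. The sketch below is a simplified illustration rather than the paper's exact procedure: it ranks training samples by their cross-entropy loss under the current model and treats the highest-loss fraction as suspicious, assuming earlier stages have already produced a backbone-plus-head model.

```python
# Loss-based sample splitting: low-loss samples are kept as trusted labeled
# data; high-loss samples are set aside as suspicious (likely poisoned).
import torch
import torch.nn.functional as F

@torch.no_grad()
def split_by_loss(model, dataset, keep_ratio: float = 0.5,
                  batch_size: int = 256):
    loader = torch.utils.data.DataLoader(dataset, batch_size=batch_size)
    losses = []
    for x, y in loader:  # no shuffling, so positions map back to dataset indices
        losses.append(F.cross_entropy(model(x), y, reduction="none"))
    losses = torch.cat(losses)
    order = losses.argsort()           # ascending loss
    k = int(keep_ratio * len(order))
    return order[:k].tolist(), order[k:].tolist()  # (trusted, suspicious)
```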
This paper proposes using a universal adversarial trigger as the backdoor trigger to attack video recognition models, a setting in which backdoor attacks are challenged by four strict conditions.
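For intuition, crafting such a universal trigger amounts to optimizing one fixed patch over many clips so that any input stamped with it is pushed toward the attacker's target class. The PyTorch sketch below is a simplified illustration, assuming clips shaped (B, 3, T, H, W) and a bottom-right patch location; the patch size, step count, and learning rate are placeholders.

```python
# Craft one patch shared across all inputs and frames.
import torch
import torch.nn.functional as F

def craft_universal_trigger(model, loader, target: int, patch=(16, 16),
                            steps: int = 100, lr: float = 0.01):
    ph, pw = patch
    delta = torch.zeros(3, ph, pw, requires_grad=True)  # the universal trigger
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        for x, _ in loader:  # x: (B, 3, T, H, W) video clips
            stamped = x.clone()
            # Stamp the same patch onto every frame's bottom-right corner.
            stamped[..., -ph:, -pw:] = delta.view(1, 3, 1, ph, pw)
            loss = F.cross_entropy(model(stamped),
                                   torch.full((x.size(0),), target))
            opt.zero_grad(); loss.backward(); opt.step()
    return delta.detach().clamp(0.0, 1.0)
```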
This work proposes a backdoor-based verification mechanism that certifies data deletion with high confidence, making novel use of backdoor attacks in machine learning as a basis for quantitatively verifying machine unlearning.
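The verification idea reduces to a simple measurement: the data owner plants a trigger in the samples they contribute, and after a deletion request checks whether the model still exhibits the backdoor. Below is a minimal sketch, assuming a hypothetical `apply_trigger` stamping function and an owner-chosen target label.

```python
# Measure the backdoor's attack success rate (ASR) on held-out probe inputs.
import torch

@torch.no_grad()
def attack_success_rate(model, probe_inputs, apply_trigger,
                        target: int) -> float:
    triggered = apply_trigger(probe_inputs)      # hypothetical stamping fn
    preds = model(triggered).argmax(dim=1)
    return (preds == target).float().mean().item()

# Usage: an ASR near the pre-deletion level indicates the contributed data
# still influences the model (deletion likely did not happen); near-chance
# ASR is consistent with successful unlearning.
```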
A novel backdoor defense framework named Attention Relation Graph Distillation (ARGD) is introduced, which fully explores the correlation among attention features of different orders using the proposed Attention Relation Graphs (ARGs).
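As a rough illustration only (a deliberate simplification, not the paper's exact formulation), one can take a layer's attention map to be the channel-wise mean of absolute activations, define a relation graph as the matrix of pairwise similarities between the attention maps of different layers, and distill the student's graph toward a clean-finetuned teacher's.

```python
# Simplified attention-relation distillation loss.
import torch
import torch.nn.functional as F

def attention_map(feat: torch.Tensor) -> torch.Tensor:
    """feat: (B, C, H, W) -> pooled, L2-normalized attention map (B, 64)."""
    a = F.adaptive_avg_pool2d(feat.abs().mean(dim=1, keepdim=True), (8, 8))
    return F.normalize(a.flatten(1), dim=1)

def relation_graph(feats) -> torch.Tensor:
    """Pairwise similarities between per-layer attention maps: (B, L, L)."""
    maps = torch.stack([attention_map(f) for f in feats], dim=1)  # (B, L, 64)
    return maps @ maps.transpose(1, 2)

def argd_loss(student_feats, teacher_feats) -> torch.Tensor:
    # Match the backdoored student's relation graph to the teacher's.
    return F.mse_loss(relation_graph(student_feats),
                      relation_graph(teacher_feats).detach())
```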
A novel two-stage backdoor defense method, named MCLDef, based on Model-Contrastive Learning (MCL), which reduces the attack success rate (ASR) by up to 95.79%, significantly outperforming state-of-the-art defense methods, while in most cases keeping the benign accuracy (BA) degradation below 2%.
This paper proposes a post-training defense that detects backdoor attacks without making any assumptions about the type of backdoor embedding, together with a novel, general approach for mitigating the backdoor once a detection is made.
A new backdoor defense strategy, Trap and Replace, which makes it much easier to remove the harmful influence of backdoored samples from the model and outperforms previous state-of-the-art methods.