Adversarial attack detection is the task of identifying inputs that have been deliberately perturbed to cause a model to misbehave.
It is verified that the MMD (maximum mean discrepancy) test is aware of adversarial attacks, which opens a new avenue for adversarial data detection based on two-sample tests.
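The statistic behind such a two-sample test can be sketched in a few lines of NumPy. The Gaussian toy data and the fixed kernel bandwidth below are illustrative assumptions, not the paper's actual setup:

```python
import numpy as np

def gaussian_kernel(X, Y, sigma=1.0):
    # Pairwise RBF kernel k(x, y) = exp(-||x - y||^2 / (2 sigma^2)).
    d2 = (np.sum(X**2, axis=1)[:, None]
          + np.sum(Y**2, axis=1)[None, :]
          - 2.0 * X @ Y.T)
    return np.exp(-d2 / (2.0 * sigma**2))

def mmd2(X, Y, sigma=1.0):
    # Biased estimate of the squared maximum mean discrepancy between
    # the samples X and Y.
    return (gaussian_kernel(X, X, sigma).mean()
            + gaussian_kernel(Y, Y, sigma).mean()
            - 2.0 * gaussian_kernel(X, Y, sigma).mean())

rng = np.random.default_rng(0)
clean   = rng.normal(0.0, 1.0, size=(200, 2))  # stand-in for clean features
same    = rng.normal(0.0, 1.0, size=(200, 2))  # another clean batch
shifted = rng.normal(1.0, 1.0, size=(200, 2))  # stand-in for attacked features

# The statistic stays small for two clean batches and grows under a shift;
# a permutation test on mmd2 would turn this into a detection decision.
print(mmd2(clean, same), mmd2(clean, shifted))
```

In practice the bandwidth (or a learned deep kernel, as in work along these lines) matters a great deal; the fixed `sigma=1.0` here is only for illustration.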
It is argued that the alteration of data by AutoAttack with l-inf perturbations of eps = 8/255 is unrealistically strong, resulting in close-to-perfect detection rates of adversarial samples even by simple detection algorithms and human observers, and that other attack methods are much harder to detect while achieving similar success rates.
This paper investigates using Prior Networks to detect adversarial attacks, proposes a generalized form of adversarial training, and shows that the appropriate training criterion for Prior Networks is the reverse KL-divergence between Dirichlet distributions.
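The KL divergence between two Dirichlet distributions has a closed form, and its asymmetry is what makes the forward and reverse criteria behave differently. A minimal sketch, using SciPy's `gammaln` and `digamma` and arbitrary example concentration vectors:

```python
import numpy as np
from scipy.special import digamma, gammaln

def dirichlet_kl(a, b):
    # KL( Dir(a) || Dir(b) ) for concentration vectors a and b.
    a0, b0 = a.sum(), b.sum()
    return (gammaln(a0) - gammaln(a).sum()
            - gammaln(b0) + gammaln(b).sum()
            + np.sum((a - b) * (digamma(a) - digamma(a0))))

sharp = np.array([50.0, 1.0, 1.0])  # confident (sharp) Dirichlet
flat  = np.array([1.0, 1.0, 1.0])   # maximally uncertain (flat) Dirichlet

# KL is asymmetric: swapping the arguments gives a different value, which
# is why the choice of direction in the training criterion matters.
print(dirichlet_kl(sharp, flat), dirichlet_kl(flat, sharp))
```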
This work proposes a meta-learning based robust detection method that detects new adversarial attacks from limited examples, using a double-network framework: a task-dedicated network and a master network that alternately learn the detection capability for either seen attacks or a new attack.
This work proposes a new type of adversarial attack on Deep Neural Networks for image classification that induces model misclassification by injecting, through an optimization procedure, style changes imperceptible to humans.
Argos first amplifies the discrepancies between the visual content of an image and its attack-induced misclassified label using a set of regeneration mechanisms, and then identifies an image as adversarial if the reproduced views deviate beyond a preset threshold.
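The decision rule at the end of such a pipeline can be sketched as below. This is a hypothetical skeleton in the spirit of the summary, not the paper's implementation: `regenerate`, `classify`, and the majority-vote threshold are all stand-in assumptions.

```python
def is_adversarial(image, label, regenerate, classify, threshold=0.5, n_views=4):
    # Reproduce the image conditioned on its predicted label, then flag it
    # when too many of the regenerated views no longer classify to that label.
    views = [regenerate(image, label) for _ in range(n_views)]
    disagreements = sum(classify(view) != label for view in views)
    return disagreements / n_views >= threshold

# Toy stand-ins (the real system uses learned regeneration models, not stubs).
regen = lambda img, lab: img
agreeing_clf = lambda img: "cat"     # views keep matching the label
disagreeing_clf = lambda img: "dog"  # views deviate from the label

print(is_adversarial("pixels", "cat", regen, agreeing_clf))
print(is_adversarial("pixels", "cat", regen, disagreeing_clf))
```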
This paper proposes Segment and Complete defense (SAC), a general framework for defending object detectors against patch attacks through detection and removal of adversarial patches, and presents the APRICOT-Mask dataset, which augments the APRICOT dataset with pixel-level annotations of adversarial patches.
This work proposes a simple sentence-embedding "residue" based detector to identify adversarial examples that outperforms ported image-domain detectors and recent state-of-the-art NLP-specific detectors.
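One simple way to realize an embedding-residue detector is to score each input by the component of its embedding lying outside the principal subspace of clean data. The PCA construction and the synthetic data below are assumptions for illustration, not the paper's exact recipe:

```python
import numpy as np

def residue_scores(train_emb, test_emb, k=2):
    # Project test embeddings onto the top-k principal directions of the
    # clean training embeddings; the norm of the leftover ("residue")
    # component serves as the adversarial score.
    mean = train_emb.mean(axis=0)
    _, _, Vt = np.linalg.svd(train_emb - mean, full_matrices=False)
    top = Vt[:k]                         # top-k principal directions
    centered = test_emb - mean
    residue = centered - centered @ top.T @ top
    return np.linalg.norm(residue, axis=1)

rng = np.random.default_rng(1)
basis = rng.normal(size=(2, 32))                      # hidden clean subspace
train = rng.normal(size=(300, 2)) @ basis + 0.01 * rng.normal(size=(300, 32))
clean = rng.normal(size=(50, 2)) @ basis + 0.01 * rng.normal(size=(50, 32))
adv   = clean + 0.5 * rng.normal(size=(50, 32))       # off-subspace shift

# Perturbed points carry much more mass outside the clean subspace.
print(residue_scores(train, clean).mean(), residue_scores(train, adv).mean())
```

Thresholding the score (e.g. at a quantile of the clean scores) would turn this into a binary detector.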