3260 papers • 126 benchmarks • 313 datasets
An Adversarial Attack is a technique to find a perturbation that changes the prediction of a machine learning model. The perturbation can be very small and imperceptible to the human eye. Source: Recurrent Attention Model with Log-Polar Mapping is Robust against Adversarial Attacks
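For concreteness, here is a minimal sketch of this idea using the fast gradient sign method (FGSM), one of the simplest such attacks; the linear model, shapes, and eps below are illustrative placeholders, not taken from any paper on this page.

```python
# Minimal FGSM sketch: perturb the input in the direction of the sign of
# the loss gradient. Model, shapes, and eps are illustrative.
import torch
import torch.nn.functional as F

def fgsm_perturb(model, x, y, eps):
    """One gradient-sign step of size eps, clipped to the valid input range."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    return (x_adv + eps * x_adv.grad.sign()).clamp(0, 1).detach()

# Toy usage with a random linear "model" on flattened 28x28 inputs.
model = torch.nn.Linear(784, 10)
x, y = torch.rand(4, 784), torch.randint(0, 10, (4,))
x_adv = fgsm_perturb(model, x, y, eps=0.1)
print(model(x).argmax(1), model(x_adv).argmax(1))  # predictions may now differ
```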
These leaderboards are used to track progress in Adversarial Attack
Use these libraries to find Adversarial Attack models and implementations
This work studies the adversarial robustness of neural networks through the lens of robust optimization, and suggests the notion of security against a first-order adversary as a natural and broad security guarantee.
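A first-order adversary of this kind is usually instantiated as projected gradient descent (PGD) within a norm ball. A hedged PyTorch sketch, with the random start, step size, and iteration count chosen purely for illustration:

```python
# Sketch of a PGD adversary under an L_inf constraint: repeated gradient-sign
# ascent steps, each followed by projection back into the eps-ball.
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=0.03, alpha=0.01, steps=10):
    x_adv = x + torch.empty_like(x).uniform_(-eps, eps)  # random start
    for _ in range(steps):
        x_adv = x_adv.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        loss.backward()
        with torch.no_grad():
            x_adv = x_adv + alpha * x_adv.grad.sign()    # ascent step
            x_adv = x + (x_adv - x).clamp(-eps, eps)     # project to the ball
            x_adv = x_adv.clamp(0, 1)                    # valid input range
    return x_adv.detach()
```

In the robust-optimization view, adversarial training then simply minimizes the loss on `pgd_attack` outputs instead of on clean inputs.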
It is demonstrated that defensive distillation does not significantly increase the robustness of neural networks, and three new attack algorithms are introduced that are successful on both distilled and undistilled neural networks with 100% probability.
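The three attacks in question are tailored to the L0, L2, and L_inf metrics; the following is a rough, untargeted sketch of the L2 formulation's core objective (a tanh change of variables plus a margin term), omitting the paper's binary search over the trade-off constant c.

```python
# Rough sketch of a Carlini-Wagner-style L2 objective (untargeted).
# The constant c is fixed here; the full method binary-searches over it.
import torch

def cw_l2(model, x, y, c=1.0, steps=100, lr=0.01):
    # tanh change of variables keeps x_adv inside [0, 1] by construction
    w = torch.atanh((2 * x - 1).clamp(-0.999, 0.999)).detach().requires_grad_(True)
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        x_adv = (torch.tanh(w) + 1) / 2
        logits = model(x_adv)
        true = logits.gather(1, y.unsqueeze(1)).squeeze(1)
        other = logits.scatter(1, y.unsqueeze(1), float("-inf")).max(1).values
        margin = (true - other).clamp(min=0)  # > 0 while still classified as y
        loss = ((x_adv - x) ** 2).flatten(1).sum(1) + c * margin
        opt.zero_grad()
        loss.sum().backward()
        opt.step()
    return ((torch.tanh(w) + 1) / 2).detach()
```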
The core functionalities of the CleverHans library are presented, namely the attacks based on adversarial examples and defenses to improve the robustness of machine learning models to these attacks.
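Illustrative usage, assuming the v4.x PyTorch interface of CleverHans; the module path and signature below should be checked against the version you install, since the API has changed across major releases.

```python
# Typical CleverHans usage sketch (assumes the v4.x torch API).
import torch
from cleverhans.torch.attacks.fast_gradient_method import fast_gradient_method

model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(784, 10))
x = torch.rand(8, 1, 28, 28)
# Craft L_inf-bounded adversarial examples with eps = 0.1.
x_adv = fast_gradient_method(model, x, eps=0.1, norm=float("inf"))
print((x_adv - x).abs().max())  # perturbation size, bounded by eps
```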
This work formalizes the space of adversaries against deep neural networks (DNNs) and introduces a novel class of algorithms to craft adversarial samples based on a precise understanding of the mapping between inputs and outputs of DNNs.
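The full algorithm builds a Jacobian-based saliency map over pairs of input features; the sketch below is a much-simplified greedy variant in the same spirit, with all names and step sizes chosen for illustration.

```python
# Simplified greedy saliency attack (in the spirit of, but much weaker
# than, the full Jacobian saliency map approach).
import torch

def greedy_saliency_attack(model, x, target, step=0.2, iters=30):
    """Targeted attack for a single input x of shape (1, D): repeatedly
    bump the most 'salient' feature toward the target class."""
    x_adv = x.clone()
    for _ in range(iters):
        x_adv = x_adv.clone().detach().requires_grad_(True)
        logits = model(x_adv)
        if logits.argmax(1).item() == target:
            break
        grad_t, = torch.autograd.grad(logits[0, target], x_adv,
                                      retain_graph=True)
        grad_o, = torch.autograd.grad(logits.sum() - logits[0, target], x_adv)
        # salient features raise the target logit and lower the rest
        saliency = torch.where((grad_t > 0) & (grad_o < 0),
                               grad_t * grad_o.abs(),
                               torch.zeros_like(grad_t))
        with torch.no_grad():
            i = saliency.flatten().argmax()
            x_new = x_adv.detach().clone().flatten()
            x_new[i] = (x_new[i] + step).clamp(0, 1)
            x_adv = x_new.view_as(x)
    return x_adv.detach()
```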
This work significantly advances the state-of-the-art in adversarial attacks against aligned language models, raising important questions about how such systems can be prevented from producing objectionable information.
It is shown that models trained with the VIB objective outperform those that are trained with other forms of regularization, in terms of generalization performance and robustness to adversarial attack.
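As a reminder of what the VIB objective looks like in practice, here is a hedged sketch: a stochastic encoder trained with cross-entropy plus a beta-weighted KL term against a standard normal prior. Dimensions and beta are illustrative assumptions.

```python
# Sketch of a VIB-style model and objective.
import torch
import torch.nn.functional as F

class VIB(torch.nn.Module):
    def __init__(self, in_dim=784, z_dim=32, n_classes=10):
        super().__init__()
        self.enc = torch.nn.Linear(in_dim, 2 * z_dim)  # mean and log-variance
        self.dec = torch.nn.Linear(z_dim, n_classes)

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        return self.dec(z), mu, logvar

def vib_loss(model, x, y, beta=1e-3):
    logits, mu, logvar = model(x)
    # KL(q(z|x) || N(0, I)) for a diagonal Gaussian encoder
    kl = 0.5 * (mu.pow(2) + logvar.exp() - logvar - 1).sum(1).mean()
    return F.cross_entropy(logits, y) + beta * kl
```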
A method is presented to learn deep ReLU-based classifiers that are provably robust against norm-bounded adversarial perturbations. The verification problem is posed as a linear program, and it is shown that its dual can itself be represented as a deep network similar to the backpropagation network, leading to very efficient optimization approaches that produce guaranteed bounds on the robust loss.
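The dual-network construction itself is too involved for a snippet, but the flavor of a certified bound can be seen on a plain linear classifier, where the worst-case L_inf margin has a closed form. This is a standard exercise, not the paper's method:

```python
# Closed-form robustness certificate for a linear classifier under an
# L_inf ball of radius eps (not the dual-LP method of the paper above).
import torch

def certified_robust_linear(W, b, x, y, eps):
    """True if no perturbation with ||delta||_inf <= eps can change the
    prediction of the linear classifier logits = W @ x + b."""
    logits = W @ x + b
    for j in range(W.shape[0]):
        if j == y:
            continue
        # worst-case margin against class j: shrink by eps * ||W[y]-W[j]||_1
        worst = (logits[y] - logits[j]) - eps * (W[y] - W[j]).abs().sum()
        if worst <= 0:
            return False
    return True

W, b = torch.randn(10, 784), torch.randn(10)
x = torch.rand(784)
print(certified_robust_linear(W, b, x, y=3, eps=0.01))
```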
The prediction error for adversarial examples (robust error) is decomposed as the sum of the natural (classification) error and boundary error, and a differentiable upper bound is provided using the theory of classification-calibrated loss, which is shown to be the tightest possible upper bound uniform over all probability distributions and measurable predictors.
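This decomposition motivates a TRADES-style training loss: natural cross-entropy plus a boundary term measured as the KL divergence between clean and adversarial predictions. A hedged sketch, with the inner maximization done by a few gradient-sign steps and all hyperparameters illustrative:

```python
# Sketch of a TRADES-style objective: cross-entropy on clean inputs plus a
# beta-weighted KL "boundary" term maximized over an L_inf ball.
import torch
import torch.nn.functional as F

def trades_loss(model, x, y, eps=0.03, alpha=0.007, steps=10, beta=6.0):
    p_clean = F.softmax(model(x), dim=1).detach()
    x_adv = x + 0.001 * torch.randn_like(x)       # small random start
    for _ in range(steps):                        # inner maximization
        x_adv = x_adv.clone().detach().requires_grad_(True)
        kl = F.kl_div(F.log_softmax(model(x_adv), dim=1), p_clean,
                      reduction="batchmean")
        kl.backward()
        with torch.no_grad():
            x_adv = x_adv + alpha * x_adv.grad.sign()
            x_adv = (x + (x_adv - x).clamp(-eps, eps)).clamp(0, 1)
    natural = F.cross_entropy(model(x), y)
    boundary = F.kl_div(F.log_softmax(model(x_adv.detach()), dim=1), p_clean,
                        reduction="batchmean")
    return natural + beta * boundary
```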
This paper develops a novel post-hoc visual explanation method called Score-CAM based on class activation mapping that outperforms previous methods on both recognition and localization tasks, and it also passes the sanity check.
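The core idea can be sketched as follows: each activation map from some convolutional layer is upsampled, used to mask the input, and weighted by the target-class score the model assigns to that masked input. Everything below (layer choice, normalization details) is an assumption for illustration.

```python
# Sketch of the Score-CAM idea: activation maps weighted by the scores of
# masked forward passes. `feats` is assumed to be detached activations.
import torch
import torch.nn.functional as F

def score_cam(feats, model, x, target):
    """feats: (C, h, w) conv activations for input x of shape (1, 3, H, W)."""
    H, W = x.shape[-2:]
    maps = F.interpolate(feats.unsqueeze(0), size=(H, W), mode="bilinear",
                         align_corners=False)[0]          # (C, H, W)
    # normalize each map to [0, 1] so it acts as a soft mask
    flat = maps.flatten(1)
    lo, hi = flat.min(1).values, flat.max(1).values
    maps = (maps - lo.view(-1, 1, 1)) / (hi - lo + 1e-8).view(-1, 1, 1)
    with torch.no_grad():
        scores = torch.stack([model(x * m)[0, target] for m in maps])
    weights = F.softmax(scores, dim=0)
    return F.relu((weights.view(-1, 1, 1) * maps).sum(0))  # (H, W) heatmap
```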
Deep neural networks are vulnerable to adversarial examples, which raises security concerns about these algorithms due to the potentially severe consequences. Adversarial attacks serve as an important surrogate to evaluate the robustness of deep learning models before they are deployed. However, most existing adversarial attacks can only fool a black-box model with a low success rate. To address this issue, we propose a broad class of momentum-based iterative algorithms to boost adversarial attacks. By integrating the momentum term into the iterative process for attacks, our methods can stabilize update directions and escape from poor local maxima during the iterations, resulting in more transferable adversarial examples. To further improve the success rates for black-box attacks, we apply momentum iterative algorithms to an ensemble of models, and show that adversarially trained models with a strong defense ability are also vulnerable to our black-box attacks. We hope that the proposed methods will serve as a benchmark for evaluating the robustness of various deep models and defense methods. With this method, we won first place in both the NIPS 2017 Non-targeted Adversarial Attack and Targeted Adversarial Attack competitions.
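A hedged sketch of the momentum update this abstract describes: the gradient is L1-normalized, accumulated with a decay factor mu, and the sign of the accumulator drives each step. Hyperparameters are illustrative.

```python
# Sketch of a momentum iterative (MI-FGSM-style) attack under an L_inf ball.
import torch
import torch.nn.functional as F

def mi_fgsm(model, x, y, eps=0.03, steps=10, mu=1.0):
    alpha = eps / steps
    g = torch.zeros_like(x)                       # accumulated gradient
    x_adv = x.clone()
    for _ in range(steps):
        x_adv = x_adv.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        loss.backward()
        grad = x_adv.grad
        norm = grad.abs().flatten(1).sum(1).view(-1, *([1] * (x.dim() - 1)))
        g = mu * g + grad / (norm + 1e-12)        # decay + L1-normalized grad
        with torch.no_grad():
            x_adv = x_adv + alpha * g.sign()
            x_adv = (x + (x_adv - x).clamp(-eps, eps)).clamp(0, 1)
    return x_adv.detach()
```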