NATTACK: A STRONG AND UNIVERSAL GAUSSIAN BLACK-BOX ADVERSARIAL ATTACK

Data and model can be found here: data and model.

Please download the data&&model and unzip them to './cifar-data' and './all_models'

Below is Table 1 from our paper, where we show the robustness of each accepted defense to the adversarial examples we can construct:

Defense	Dataset	Distance	Success rate
ADV-TRAIN Madry et al. (2018)	CIFAR	0.031 (linf)	47.9%
ADV-BNN Liu et al. (2019)	CIFAR	0.035 (linf)	75.3%
THERM-ADV Buckman et al. (2018)Madry et al. (2018)	CIFAR	0.031 (linf)	91.2%
CAS-ADV Na et al. (2018)	CIFAR	0.031 (linf)	97.7%
ADV-GAN Wang & Yu (2019)	CIFAR	0.015 (linf)	98.3%
LID Ma et al. (2018)	CIFAR	0.031 (linf)	100.0%
THERM Buckman et al. (2018)	CIFAR	0.031 (linf)	100.0%
SAP Dhillon et al. (2018)	CIFAR	0.031 (linf)	100.0%
RSE Liu et al. (2018)	CIFAR	0.031 (linf)	100.0%
GUIDED DENOISER (Liao et al., 2018)	ImageNet	0.031 (linf)	95.5%
RANDOMIZATION Xie et al. (2018)	ImageNet	0.031 (linf)	96.5%
INPUT-TRANS Guo et al. (2018)	ImageNet	0.005 (l2)	100.0%
PIXEL DEFLECTION Prakash et al. (2018)	ImageNet	0.031 (linf)	100.0%

Paper

Abstract:

Powerful adversarial attack methods are vital for understanding how to construct robust deep neural networks (DNNs) and for thoroughly testing defense techniques. In this paper, we propose a black-box adversarial attack algorithm that can defeat both vanilla DNNs and those generated by various defense techniques developed recently. Instead of searching for an "optimal" adversarial example for a benign input to a targeted DNN, our algorithm finds a probability density distribution over a small region centered around the input, such that a sample drawn from this distribution is likely an adversarial example, without the need of accessing the DNN's internal layers or weights. Our approach is universal as it can successfully attack different neural networks by a single algorithm. It is also strong; according to the testing against 2 vanilla DNNs and 13 defended ones, it outperforms state-of-the-art black-box or white-box attack methods for most test cases. Additionally, our results reveal that adversarial training remains one of the best defense techniques, and the adversarial examples are not as transferable across defended DNNs as them across vanilla DNNs.

Source code

This repository contains our implemenation of the black-box attack algorithm described in our paper, six defense methods (SAP, LID, RANDOMIZATION, INPUT-TRANS, THERM, and THERM-DAV) borrowed from the code of Anish et al. (2018), two defended models (GUIDED DENOISER and PIXEL DEFLECTION) based on the code of Athalye & Carlini, (2018), and two defended models (RSE and CAS-ADV) from the original papers.

Note

Please note the paper is still under double-blind review.

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
cascade_adv_training		cascade_adv_training
guided-denoiser		guided-denoiser
inputtransformations		inputtransformations
lid		lid
pixel-deflection		pixel-deflection
randomization		randomization
robustnet		robustnet
sap		sap
therm-adv		therm-adv
wideresnet28		wideresnet28
.gitignore		.gitignore
.pydevproject		.pydevproject
README.md		README.md
confusion.py		confusion.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

NATTACK: A STRONG AND UNIVERSAL GAUSSIAN BLACK-BOX ADVERSARIAL ATTACK

Paper

Source code

Note

About

Releases

Packages

Languages

gaussian-attack/Nattack

Folders and files

Latest commit

History

Repository files navigation

NATTACK: A STRONG AND UNIVERSAL GAUSSIAN BLACK-BOX ADVERSARIAL ATTACK

Paper

Source code

Note

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages