Improving Accuracy-robustness Trade-off via Pixel Reweighted Adversarial Training (ICML 2024)

This repository is the official PyTorch implementation of the ICML'24 paper:

Improving Accuracy-robustness Trade-off via Pixel Reweighted Adversarial Training

Author List: Jiacheng Zhang, Feng Liu, Dawei Zhou, Jingfeng Zhang, Tongliang Liu.

Abstract

Adversarial training (AT) trains models using adversarial examples (AEs), which are natural images modified with specific perturbations to mislead the model. These perturbations are constrained by a predefined perturbation budget $\epsilon$ and are equally applied to each pixel within an image. However, in this paper, we discover that not all pixels contribute equally to the accuracy on AEs (i.e., robustness) and accuracy on natural images (i.e., accuracy). Motivated by this finding, we propose Pixel-reweighted AdveRsarial Training (PART), a new framework that partially reduces $\epsilon$ for less influential pixels, guiding the model to focus more on key regions that affect its outputs. Specifically, we first use class activation mapping (CAM) methods to identify important pixel regions, then we keep the perturbation budget for these regions while lowering it for the remaining regions when generating AEs. In the end, we use these pixel-reweighted AEs to train a model. PART achieves a notable improvement in accuracy without compromising robustness on CIFAR-10, SVHN and TinyImagenet-200, justifying the necessity to allocate distinct weights to different pixel regions in robust classification.

Figure 1: The proof-of-concept experiment.

We find that fundamental discrepancies exist among different pixel regions. Specifically, we segment each image into four equal-sized regions (i.e., ul, short for upper left; ur, short for upper right; br, short for bottom right; bl, short for bottom left) and adversarially train two ResNet-18 models on CIFAR-10 using standard AT with the same experimental settings except for the allocation of $\epsilon$. Robustness is evaluated by $\ell_{\infty}$-norm PGD-20. With the same overall perturbation budget (i.e., setting $\epsilon$ to $6/255$ for one region and to $12/255$ for the other three), we find that both natural accuracy and adversarial robustness change significantly when the regional allocation of $\epsilon$ differs. For example, by changing $\epsilon_{\rm{br}} = 6/255$ to $\epsilon_{\rm{ul}} = 6/255$, accuracy improves by 1.23% and robustness improves by 0.94%.
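The regional allocation described above can be sketched as a per-pixel budget map. This is a minimal, framework-free illustration and not code from this repository: the names REGIONS, quadrant_eps_map and mean_budget are hypothetical, and nested lists stand in for image tensors.

```python
# Hypothetical helper names; nested lists stand in for image tensors.
# Each region maps to (row half, column half): 0 = upper/left, 1 = bottom/right.
REGIONS = {"ul": (0, 0), "ur": (0, 1), "bl": (1, 0), "br": (1, 1)}


def quadrant_eps_map(height, width, low_region, eps_low=6 / 255, eps_high=12 / 255):
    """Per-pixel budget map: eps_low in one quadrant, eps_high in the other three."""
    if low_region not in REGIONS:
        raise ValueError(f"unknown region: {low_region!r}")
    half_h, half_w = height // 2, width // 2
    target = REGIONS[low_region]
    return [
        [
            eps_low if (int(r >= half_h), int(c >= half_w)) == target else eps_high
            for c in range(width)
        ]
        for r in range(height)
    ]


def mean_budget(eps_map):
    """Average perturbation budget over all pixels of the map."""
    values = [eps for row in eps_map for eps in row]
    return sum(values) / len(values)


# CIFAR-10 images are 32x32; compare the mean budget across allocations.
for region in REGIONS:
    eps_map = quadrant_eps_map(32, 32, region)
    print(region, round(mean_budget(eps_map) * 255, 3))
```

Under these assumptions, every choice of low-budget region yields the same mean budget, (6 + 3 x 12) / 4 = 10.5 in units of 1/255, so the two training runs differ only in where the smaller budget is placed.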

Figure 2: The illustration of our method.

Compared to AT, PART leverages CAM methods to identify important pixel regions. Based on the class activation map, we multiply the perturbation element-wise by a mask during the generation of AEs, which keeps the perturbation budget $\epsilon$ for important pixel regions while shrinking it to $\epsilon^{\rm low}$ for the remaining regions.
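The masking step can be illustrated with a small framework-free sketch. It is not the repository's implementation: build_mask and apply_mask are hypothetical names, selecting important pixels with a fixed threshold on the CAM score is an assumption, and the values 8/255 and 7/255 are illustrative only.

```python
def build_mask(cam, threshold, eps, eps_low):
    """Scaling mask: 1.0 where the CAM score is high, eps_low / eps elsewhere.

    Assumption: a pixel counts as important when its CAM score reaches
    the threshold. The README does not specify the selection rule.
    """
    if not 0 < eps_low <= eps:
        raise ValueError("expected 0 < eps_low <= eps")
    ratio = eps_low / eps
    return [[1.0 if score >= threshold else ratio for score in row] for row in cam]


def apply_mask(delta, mask):
    """Element-wise product of a perturbation and the mask."""
    return [
        [d * m for d, m in zip(d_row, m_row)]
        for d_row, m_row in zip(delta, mask)
    ]


# Illustrative values only: a 2x2 "image" with a worst-case perturbation.
eps, eps_low = 8 / 255, 7 / 255
cam = [[0.9, 0.1], [0.2, 0.8]]
delta = [[eps, -eps], [eps, -eps]]

mask = build_mask(cam, threshold=0.5, eps=eps, eps_low=eps_low)
reweighted = apply_mask(delta, mask)
print(reweighted)
```

In this sketch a perturbation that already respects the budget $\epsilon$ stays within $\epsilon$ on the important pixels and within $\epsilon^{\rm low}$ on the others after masking, since the unimportant entries are scaled by $\epsilon^{\rm low} / \epsilon$.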

Requirements

This codebase is written for Python 3 and PyTorch.
To install the necessary Python packages, run pip install -r requirements.txt.

Data

Please download and place the dataset into the 'data' directory.

Run Experiments

Train and Evaluate PART

python3 train_eval_part.py

Train and Evaluate PART-T

python3 train_eval_part_t.py

Train and Evaluate PART-M

python3 train_eval_part_m.py

License and Contributing

This README is formatted based on the NeurIPS guideline.
Feel free to post any issues via GitHub.

Citation

@inproceedings{zhang2024improving,
    title={Improving Accuracy-robustness Trade-off via Pixel Reweighted Adversarial Training}, 
    author={Jiacheng Zhang and Feng Liu and Dawei Zhou and Jingfeng Zhang and Tongliang Liu},
    booktitle={International Conference on Machine Learning (ICML)},
    year={2024}
}
