This repository is the official PyTorch implementation of the ICML'24 paper:
Improving Accuracy-robustness Trade-off via Pixel Reweighted Adversarial Training
Author List: Jiacheng Zhang, Feng Liu, Dawei Zhou, Jingfeng Zhang, Tongliang Liu.
Adversarial training (AT) trains models using adversarial examples (AEs), which are natural images modified with specific perturbations to mislead the model.
These perturbations are constrained by a predefined perturbation budget
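As a rough illustration of how budget-constrained AEs are commonly generated, the following is a minimal sketch of an L-infinity PGD attack. It is not taken from this repository: the function name `pgd_attack` and the values `eps=8/255`, `step_size=2/255`, and `steps=10` are assumptions chosen for illustration.

```python
import torch
import torch.nn.functional as F


def pgd_attack(model, x, y, eps=8 / 255, step_size=2 / 255, steps=10):
    """Sketch of L-infinity PGD; all hyperparameters are illustrative assumptions."""
    # Random start inside the eps-ball around the natural images.
    delta = torch.empty_like(x).uniform_(-eps, eps)
    for _ in range(steps):
        delta.requires_grad_(True)
        loss = F.cross_entropy(model(x + delta), y)
        (grad,) = torch.autograd.grad(loss, delta)
        # Ascend the loss, then project back onto the eps-ball.
        delta = (delta.detach() + step_size * grad.sign()).clamp(-eps, eps)
        # Keep the perturbed images inside the valid pixel range [0, 1].
        delta = (x + delta).clamp(0, 1) - x
    return (x + delta).detach()
```

Here every pixel shares the same budget `eps`, which is the uniform setting that standard AT assumes.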
We find that fundamental discrepancies exist among different pixel regions. Specifically, we segment each image into four equal-sized regions (i.e., ul, short for upper left; ur, short for upper right; br, short for bottom right; bl, short for bottom left) and adversarially train two ResNet-18 models on CIFAR-10 using standard AT with the same experimental settings except for the allocation of
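One way to picture a region-dependent allocation is a per-pixel budget map built from the four quadrants. The sketch below is an assumption-laden illustration, not the repository's code: the helper names `quadrant_eps_map` and `project` are hypothetical, and the budget values in the usage note are placeholders.

```python
import torch


def quadrant_eps_map(height, width, eps_by_region):
    """Build an (H, W) budget map from budgets keyed by 'ul', 'ur', 'bl', 'br'."""
    h, w = height // 2, width // 2
    eps_map = torch.empty(height, width)
    eps_map[:h, :w] = eps_by_region["ul"]  # upper left
    eps_map[:h, w:] = eps_by_region["ur"]  # upper right
    eps_map[h:, :w] = eps_by_region["bl"]  # bottom left
    eps_map[h:, w:] = eps_by_region["br"]  # bottom right
    return eps_map


def project(delta, eps_map):
    """Clip a perturbation of shape (N, C, H, W) to the per-pixel budget (H, W)."""
    return torch.maximum(torch.minimum(delta, eps_map), -eps_map)
```

For example, `quadrant_eps_map(32, 32, {"ul": 12/255, "ur": 6/255, "bl": 6/255, "br": 12/255})` gives a CIFAR-10-sized map with illustrative budgets, and `project` can replace the scalar clamp in a PGD loop.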
Compared to AT, PART leverages the power of CAM methods to identify important pixel regions. Based on the class activation map, we multiply the perturbation element-wise by a mask to keep the perturbation budget
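The masking step can be sketched as follows, assuming the class activation maps are already computed. This is only an illustration under stated assumptions: the function names, the `threshold=0.5` rule for deciding which pixels count as important, and the values `eps=8/255` and `eps_low=7/255` are hypothetical and may differ from what the training scripts actually use.

```python
import torch


def cam_to_budget_map(cam, eps=8 / 255, eps_low=7 / 255, threshold=0.5):
    """Turn class activation maps (N, H, W) into per-pixel budgets (N, 1, H, W)."""
    # Min-max normalise each map to [0, 1] so one threshold applies to all images.
    flat = cam.flatten(1)
    lo = flat.min(dim=1).values.view(-1, 1, 1)
    hi = flat.max(dim=1).values.view(-1, 1, 1)
    cam_norm = (cam - lo) / (hi - lo).clamp_min(1e-12)
    # Assumed rule: pixels above the threshold keep eps, the rest get eps_low.
    budget = torch.full_like(cam, eps_low)
    budget[cam_norm > threshold] = eps
    return budget.unsqueeze(1)


def reweight(delta, budget_map, eps=8 / 255):
    """Scale an eps-bounded perturbation element-wise by the mask budget_map / eps."""
    return delta * (budget_map / eps)
```

The mask equals 1 where the budget is kept and `eps_low / eps` elsewhere, so the reweighted perturbation never exceeds the per-pixel budget.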
- This codebase is written for `python3` and `pytorch`.
- To install the necessary Python packages, run `pip install -r requirements.txt`.
- Please download and place the dataset into the 'data' directory.
python3 train_eval_part.py
python3 train_eval_part_t.py
python3 train_eval_part_m.py
- This README is formatted based on the NeurIPS guideline.
- Feel free to post any issues via GitHub.
@inproceedings{zhang2024improving,
title={Improving Accuracy-robustness Trade-off via Pixel Reweighted Adversarial Training},
author={Jiacheng Zhang and Feng Liu and Dawei Zhou and Jingfeng Zhang and Tongliang Liu},
booktitle={International Conference on Machine Learning (ICML)},
year={2024}
}