```bash
conda create --name rvsdta python=3.8.13
conda activate rvsdta
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt
```
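After installation, it may help to confirm that the CUDA build of PyTorch is active before running the attacks; a quick sanity check:

```python
import torch

# Expect a version string ending in "+cu118" and True on a CUDA-capable machine.
print(torch.__version__)
print(torch.cuda.is_available())
```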
- Download `word2id.pkl` and `wordvec.pkl` for the synonym model, and put the downloaded files into the `Word2Vec` directory (a loading sketch follows below).
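The internal format of these files is not documented here; as a rough sketch, assuming `word2id.pkl` maps each word to a row index and `wordvec.pkl` stores the corresponding embedding matrix, a synonym lookup could work along these lines:

```python
import pickle

import numpy as np

# Assumed layout: word2id is a dict {word: row index}, wordvec is the
# embedding matrix with one row per word. Verify against the actual files.
with open("Word2Vec/word2id.pkl", "rb") as f:
    word2id = pickle.load(f)
with open("Word2Vec/wordvec.pkl", "rb") as f:
    wordvec = np.asarray(pickle.load(f))

id2word = {i: w for w, i in word2id.items()}

def top_synonyms(word, k=10):
    """Return the k nearest words by cosine similarity (the word itself excluded)."""
    v = wordvec[word2id[word]]
    sims = wordvec @ v / (np.linalg.norm(wordvec, axis=1) * np.linalg.norm(v) + 1e-12)
    ranked = np.argsort(-sims)
    return [id2word[i] for i in ranked if id2word[i] != word][:k]

print(top_synonyms("car"))
```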
- A script is provided to perform targeted attacks on Stable Diffusion:
```bash
# Training for generating the adversarial prompts
python run.py --config_path ./object_config.json # Object attacks
python run.py --config_path ./style_config.json # Style attacks
# Testing for evaluating the attack success rate
python test_object_multi.py --config_path ./object_config.json # Object attacks
python test_style_multi.py --config_path ./style_config.json # Style attacks
# Testing for evaluating the FID score of generated images
python IQA.py --gen_img_path [the root of generated images] --task [object or style] --attack_goal_path [the path of referenced images] --metric image_quality
```
Config can be loaded from a JSON file and has the following parameters (a sample config is sketched after the list):

- `add_suffix_num`: the number of suffixes in the word-addition perturbation strategy. The default is 5.
- `replace_type`: a list specifying the word types targeted by the word-substitution strategy. The default is `['all']`, which replaces all words except nouns. Options: `["verb", "adj", "adv", "prep"]`.
- `synonym_num`: the number of forbidden synonyms. The default is 10.
- `iter`: the total number of iterations. The default is 500.
- `lr`: the learning rate for the optimizer. The default is 0.1.
- `weight_decay`: the weight decay for the optimizer.
- `loss_weight`: the weight of the MSE loss in style attacks.
- `print_step`: the number of steps between status printouts.
- `batch_size`: the number of referenced images used in each iteration.
- `clip_model`: the name of the CLIP model to use; `"laion/CLIP-ViT-H-14-laion2B-s32B-b79K"` is the model used in SD 2.1.
- `prompt_path`: the path of the clean prompt file.
- `task`: the targeted attack task. Options: `"object"` or `"style"`.
- `forbidden_words`: a txt file listing the forbidden words for each target goal.
- `target_path`: the file path of the referenced images.
- `output_dir`: the path for saving the learned adversarial prompts.
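As an illustration only, the following writes a config with the documented defaults; the path values and the fields without documented defaults (`weight_decay`, `loss_weight`, `print_step`, `batch_size`) are placeholders, and the exact schema should be checked against the `object_config.json` shipped with the repo:

```python
import json

# Defaults are taken from the parameter list above; everything else is a
# placeholder to be adapted to your own setup.
config = {
    "add_suffix_num": 5,
    "replace_type": ["all"],
    "synonym_num": 10,
    "iter": 500,
    "lr": 0.1,
    "weight_decay": 0.0,          # placeholder
    "loss_weight": 1.0,           # placeholder; only used for style attacks
    "print_step": 50,             # placeholder
    "batch_size": 4,              # placeholder
    "clip_model": "laion/CLIP-ViT-H-14-laion2B-s32B-b79K",
    "prompt_path": "./prompts/clean_prompts.txt",   # placeholder path
    "task": "object",
    "forbidden_words": "./forbidden_words/target.txt",  # placeholder path
    "target_path": "./references/object",               # placeholder path
    "output_dir": "./outputs/object",                   # placeholder path
}

with open("object_config.json", "w") as f:
    json.dump(config, f, indent=2)
```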
We release the adversarial attack dataset used to achieve object attacks on Stable Diffusion. The dataset is available at [Link].
If you find this repo useful, please consider citing:
@article{zhang2024revealing,
title={Revealing Vulnerabilities in Stable Diffusion via Targeted Attacks},
author={Zhang, Chenyu and Wang, Lanjun and Liu, Anan},
journal={arXiv preprint arXiv:2401.08725},
year={2024}
}