Create a virtual environment, for example with conda:
conda create -n AdvWeb python=3.12.2
conda activate AdvWeb
Clone this repository:
git clone https://github.com/AI-secure/AdvWeb.git
cd AdvWeb
Install the dependencies:
pip install -r requirements.txt
Set up the OpenAI API key and other required keys in your environment (our pipeline supports attacking various large language models such as GPT, Gemini, and Claude; here we take attacking GPT as an example):
export OPENAI_API_KEY=<YOUR_KEY>
export HUGGING_FACE_HUB_TOKEN=<YOUR_KEY>
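To confirm both keys are live before running the pipeline, you can query the standard OpenAI models endpoint and the Hugging Face CLI (generic sanity checks, not part of this repo):
# Both commands should succeed without authentication errors.
curl -s https://api.openai.com/v1/models -H "Authorization: Bearer $OPENAI_API_KEY" | head -c 200
huggingface-cli whoami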
We conduct experiments on the Mind2Web dataset and test our approach against the state-of-the-art web agent framework, SeeAct.
Download the source data Multimodal-Mind2Web from Hugging Face and store it in the path data/Multimodal-Mind2Web/data/.
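As a sketch, assuming the dataset is hosted as osunlp/Multimodal-Mind2Web on the Hugging Face Hub (verify the exact repo ID on the Hub), huggingface-cli can download it directly into the expected path:
# Fetch the dataset snapshot into data/Multimodal-Mind2Web/data/.
huggingface-cli download osunlp/Multimodal-Mind2Web --repo-type dataset --local-dir data/Multimodal-Mind2Web/data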
Download the SeeAct source data and store it in the path data/seeact_source_data/.
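For example, assuming the download arrives as a zip archive (the file name seeact_source_data.zip below is hypothetical):
# Create the target directory and unpack the archive into it.
mkdir -p data/seeact_source_data
unzip seeact_source_data.zip -d data/seeact_source_data/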
Run the notebook data_generation.ipynb to filter data from the source dataset and construct the training set and test set.
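To execute the notebook non-interactively, standard Jupyter tooling works:
# Run all cells in place and save the outputs back to the notebook.
jupyter nbconvert --to notebook --execute --inplace data_generation.ipynb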
Run training_data_generation.sh to test the quality of the data in the training set and construct the datasets for SFT and DPO.
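Because this step tests the data against the target model, the API keys exported above should be set in the same shell; saving a log makes it easier to inspect failed generations (the log path below is illustrative):
# Capture stdout and stderr for later inspection.
mkdir -p logs
bash training_data_generation.sh 2>&1 | tee logs/training_data_generation.log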
After completing the Data Generation section, your file structure should look like this:
├── task_demo_-1_aug
├── attack_dataset.json
├── subset_test_data_aug
│   ├── train.json
│   ├── test.json
│   ├── augmented_dataset.json
│   ├── predictions
│   │   ├── prediction-4api-augment-data.jsonl
│   │   ├── augmented_dataset_correct.json
│   │   └── prediction-4api-augment-data-correct.jsonl
│   └── imgs
│       └── f5da4b14-026d-4a10-ab89-f5720418f2b4_9016ffb6-7468-4495-ad07-756ac9f2af03.jpg
└── together
    └── data
        └── sft_train_data.jsonl
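Before uploading, it is worth a quick sanity check that sft_train_data.jsonl is valid JSONL (the paths below assume the tree above sits under data/task_demo_-1_aug/, consistent with the paths used later):
# Pretty-print the first record and count the training examples.
head -n 1 data/task_demo_-1_aug/together/data/sft_train_data.jsonl | python -m json.tool
wc -l data/task_demo_-1_aug/together/data/sft_train_data.jsonl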
We fine-tune the model by calling Together AI's API. The basic training process is as follows (for more instructions, please refer to the Together AI docs):
Set up Together AI API key:
export TOGETHER_API_KEY=<YOUR_KEY>
Upload the training dataset:
together files upload "xxx.jsonl"
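The upload response includes a file ID of the form file-..., which the next step needs; you can list previously uploaded files with the Together CLI:
together files list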
Train the SFT model:
together fine-tuning create \
--training-file "file-xxx" \
--model "mistralai/Mistral-7B-Instruct-v0.2" \
--lora \
--batch-size 16
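While the job runs, the Together CLI can report its status and event log (substitute the ft-... job ID printed by the create command):
# Check job status and stream training events.
together fine-tuning retrieve "ft-xxx"
together fine-tuning list-events "ft-xxx"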
Download the SFT model:
together fine-tuning download "ft-xxx"
You can store the SFT model in the path data/task_demo_-1_aug/together/new_models/.
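The download typically arrives as a compressed tarball; assuming a zstd-compressed archive named ft-xxx.tar.zst (the actual name comes from the download command's output), unpack it into that directory:
# Unpack the downloaded checkpoint (archive name is illustrative).
mkdir -p data/task_demo_-1_aug/together/new_models
tar --zstd -xf ft-xxx.tar.zst -C data/task_demo_-1_aug/together/new_models/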
Run dpo_training.sh to train the DPO model.
Select the best model based on the training curve, and run dpo_model_merge.sh to merge it.
Run evaluation.sh to evaluate the SFT and DPO models.
If you find this code useful, please cite our paper: