Adversarial Math Word Problem Generation

This is the official repository for the paper LLM-Resistant Math Word Problem Generation via Adversarial Attacks. It contains the code for generating adversarial math problems, attacking models, and analyzing the generated adversarial examples.

Installation

Install all required packages using pip install -r requirements.txt.

Input Data Format

The original math problems should be formatted as a JSON file with "Problem" and "final_ans" fields:

[
    {
        "Problem": "<Problem 0>",
        "final_ans": "<Answer 0>"
        ...
    },
    {
        "Problem": "<Problem 1>",
        "final_ans": "<Answer 1>"
        ...
    },
    ...
]

An input data example from GSM8K is provided at data/example/input_example.json.
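
To run the pipeline on your own dataset, a minimal sketch for producing a file in this format is shown below. The problem texts, answers, and output path are placeholders; only the "Problem" and "final_ans" fields are required, as in the example above.

# Minimal sketch: write your own (question, answer) pairs in the expected format.
import json

my_problems = [
    ("<Problem 0>", "<Answer 0>"),  # replace with real problem text and final answer
    ("<Problem 1>", "<Answer 1>"),
]

records = [{"Problem": q, "final_ans": a} for q, a in my_problems]

with open("data/my_input.json", "w") as f:  # placeholder path
    json.dump(records, f, indent=4)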

Generation

Go to the generation folder and run main.py to generate adversarial examples. The script calls GPT-4-turbo once (~$0.01) per original problem to generate Python code. Here is an example command for generating 20 adversarial math problems:

python main.py \
  --action generation \
  --output_directory ../generated \
  --original_question_file ../data/example/input_example.json \
  --openai_key_path PATH/TO/OPENAI/APIKEY/IN/TXT \
  --code_directory ../data/example/code \
  --m 20 

You can specify your own generation settings, file structure/paths, etc. in config.py. The final generated adversarial examples will be saved in ../generated/adversarial_example.json. We release all the generated adversarial examples for our experiments in the data/experimental_data folder.
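
As a quick sanity check on the output, the following sketch inspects the generated file (it assumes adversarial_example.json is a JSON list of records; the exact field names are defined by the generation script):

# Sketch: inspect the generated adversarial examples.
import json

with open("../generated/adversarial_example.json") as f:
    adversarial = json.load(f)

print(f"{len(adversarial)} adversarial problems generated")
print("Fields per record:", sorted(adversarial[0].keys()))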

Attack

To attack models with the generated adversarial examples, go to the attack folder and run attack.py. An example command for attacking meta-math/MetaMath-70B-V1.0 is:

python attack.py \
  --model meta-math/MetaMath-70B-V1.0 \
  --adversarial_data_path ../generated/adversarial_example.json \
  --main_output_directory_path ../results 

The model outputs are saved in ../results. The script also writes a model output summary to model_output_summary and all incorrectly answered problems to incorrect_problems.
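
One simple way to turn these outputs into an attack success rate is sketched below; the per-model file name under incorrect_problems and the assumption that both files are JSON lists are ours, so adjust the paths to your run.

# Hypothetical post-processing: estimate attack success rate by comparing the number of
# incorrectly answered problems with the total number of adversarial problems.
import json

with open("../generated/adversarial_example.json") as f:
    total = len(json.load(f))

with open("../results/incorrect_problems/MetaMath-70B-V1.0.json") as f:  # assumed file layout
    incorrect = len(json.load(f))

print(f"Attack success rate: {incorrect / total:.2%} ({incorrect}/{total})")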

Analysis

To analyze the generated adversarial examples, run main.py in the generation folder with --action analysis. The script compares the model's incorrect predictions with the ground truth. An example command for analyzing MetaMath-70B-V1.0 is:

python main.py \
  --action analysis \
  --output_directory ../analysis \
  --model_name MetaMath-70B-V1.0 \
  --model_incorrect_file ../data/example/incorrect_problems/MetaMath-70B-V1.0.json 

You can specify your own analysis settings, show plots, etc. in config.py. The final analysis results will be saved in ../analysis/.
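
For reference, the comparison step conceptually amounts to checking the model's final answer against final_ans. An illustrative sketch (not the repository's analysis code) is:

# Illustrative only: extract the last number from a model response and compare it
# with the ground-truth answer.
import re

def last_number(text):
    """Return the last numeric token in a model response, or None if there is none."""
    matches = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return matches[-1] if matches else None

def is_correct(model_output, final_ans):
    pred = last_number(model_output)
    return pred is not None and float(pred) == float(final_ans)

print(is_correct("... so the final answer is 42.", "42"))  # True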

Citation

If you find this code useful, please cite the following paper:


