Note
Repository under active construction
PullingAce is a Python library for benchmarking adversarial attacks on Hugging Face models. Built on top of TextAttack and Garak, PullingAce bundles a set of recipes and attacks to assess the robustness of natural language processing models in classification and generation tasks. It evaluates model vulnerabilities and helps machine learning researchers and practitioners understand the strengths and weaknesses of different models.
- Adversarial Attack Benchmarks: PullingAce provides a collection of adversarial attack benchmarks tailored for Hugging Face models. Evaluate model robustness against state-of-the-art attacks.
- Incorporating TextAttack Recipes: PullingAce integrates TextAttack's powerful attack recipes, making it easy to experiment with different attack strategies and customize evaluations.
pullingace --attack tomato --model "textattack/albert-base-v2-ag-news" --dataset "ag_news" --num-examples 10
- Prompt Injection: PullingAce integrates the prompt injection feature from the Garak library, allowing for more dynamic and flexible adversarial attacks.
- Toxicity: PullingAce incorporates Garak's toxicity features, providing additional metrics for evaluating model robustness.
- Risk Cards: This framework catalogs a large set of risks that might be present in LM deployment. Risks can affect a variety of actors in a variety of ways. Not every risk applies in every scenario, so not all lmrc probes will be relevant to every system.
- Leak Replay: Integrates probes for evaluating whether a model will replay training data.
Creates an HTML report that analyzes the attack.
# Replace with a specific command for prompt injection
pullingace prompt_injection --model_type huggingface --model_name "amazon/MistralLite" --probes HijackHateHumans
Note: the value passed to --probes must match a name in PROBE_FAMILIES, defined in subprocessor.
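Validating a probe name before launching a long run can save time. The following is a minimal sketch only: the `PROBE_FAMILIES` set below is a stand-in built from the two `--probes` values that appear in this README, not the real definition in subprocessor, and `validate_probe` is a hypothetical helper that is not part of PullingAce.

```python
# Stand-in for the real PROBE_FAMILIES in PullingAce's subprocessor module.
# These names are simply the --probes values used in this README; the actual
# contents (and whether it is a set, list, or dict) may differ.
PROBE_FAMILIES = {"promptinject", "HijackHateHumans"}


def validate_probe(name, known=PROBE_FAMILIES):
    """Return the probe name unchanged, or raise ValueError listing valid names."""
    if name not in known:
        valid = ", ".join(sorted(known))
        raise ValueError(f"Unknown probe {name!r}; expected one of: {valid}")
    return name
```

In a real script, importing the actual `PROBE_FAMILIES` from the package would be preferable to copying it, so the check cannot drift from the CLI.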
To get started with PullingAce, follow these steps:
- Create a virtual environment
- Clone the repository:
git clone https://github.com/<username>/pullingace.git
- Navigate to the root directory:
cd pullingace
- Install the package using pip:
pip install .
## Try
pullingace --attack tomato --model "textattack/albert-base-v2-ag-news" --dataset "ag_news" --num-examples 5
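To repeat the command above for several example counts or models, it can be assembled from Python. This is a minimal sketch, assuming the `pullingace` executable is on `PATH` and accepts exactly the flags shown above; `build_attack_command` and `run_attack` are hypothetical helper names, not part of PullingAce.

```python
import shlex
import subprocess


def build_attack_command(attack, model, dataset, num_examples):
    """Return the argument list for the CLI call shown in this README.

    Only flags that appear in the README are used; this helper is
    illustrative and not part of PullingAce itself.
    """
    return [
        "pullingace",
        "--attack", attack,
        "--model", model,
        "--dataset", dataset,
        "--num-examples", str(num_examples),
    ]


def run_attack(attack, model, dataset, num_examples):
    """Run the attack; raises CalledProcessError if the CLI exits non-zero."""
    cmd = build_attack_command(attack, model, dataset, num_examples)
    print("Running:", shlex.join(cmd))
    return subprocess.run(cmd, check=True)
```

For example, `run_attack("tomato", "textattack/albert-base-v2-ag-news", "ag_news", 5)` issues the same call as the command above. Passing the arguments as a list avoids shell quoting problems with model names.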
This package was created with Cookiecutter and the sourcery-ai/python-best-practices-cookiecutter project template.
PullingAce is currently intended to be used as a Python library.
HAPPY PATH
pipenv install
pip3 install .
pip3 install garak (SEE WHY)
Currently tested on main with:
pullingace prompt_injection --model_type huggingface --model_name "amazon/MistralLite" --probes promptinject
Uninstall the Previous Version: If you have a previous version of the package installed, uninstall it first to avoid conflicts:
pip uninstall pullingace
Replace pullingace with the name of your package.
Increment the Version Number: If you've made changes that you want to distribute, it's a good practice to increment the version number in your setup.py file.
from setuptools import setup

setup(
    name='your_package_name',
    version='0.2',  # Increment this number
    # ...
)
Clear Old Build Directories: Remove old build artifacts to make sure you're starting fresh. Navigate to the folder containing setup.py and run:
rm -rf build dist pulling_ace.egg-info
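On systems without `rm` (for example a plain Windows shell), the same cleanup can be done from Python. This is a minimal sketch; `clean_build_artifacts` is a hypothetical helper, and the directory names are copied from the command above.

```python
import shutil
from pathlib import Path


def clean_build_artifacts(project_root, names=("build", "dist", "pulling_ace.egg-info")):
    """Delete the listed directories under project_root; return those removed."""
    removed = []
    for name in names:
        target = Path(project_root) / name
        # Skip anything that does not exist so repeated runs are harmless.
        if target.is_dir():
            shutil.rmtree(target)
            removed.append(name)
    return removed
```

Calling `clean_build_artifacts(".")` from the folder containing setup.py removes whichever of the three directories exist and leaves everything else untouched.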
Rebuild the Package: Navigate to the folder where your setup.py file is located and run:
pip install .
Check Installation: You can check if the package is installed correctly by running:
pip list
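Instead of scanning the `pip list` output by eye, the installed version can be queried from Python with the standard library. This is a minimal sketch; `installed_version` is a hypothetical helper, and the distribution name `pullingace` is an assumption taken from the `pip uninstall` command in this README (the name declared in setup.py may differ).

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None
```

After a rebuild, `installed_version("pullingace")` should return the version number set in setup.py, or `None` if the install did not succeed under that name.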
# Install dependencies
pipenv install --dev
# Setup pre-commit and pre-push hooks
pipenv run pre-commit install -t pre-commit
pipenv run pre-commit install -t pre-push