PullingAce: Benchmarking Robustness for LLM Models

Note

Repository under active construction

PullingAce is a Python library for benchmarking adversarial attacks on Hugging Face models. Built on top of TextAttack and Garak, it incorporates a set of recipes and attacks to assess the robustness of natural language processing models in classification and generation tasks. It evaluates model vulnerabilities and helps machine learning researchers and practitioners understand the strengths and weaknesses of different models.

Features for Classification

Adversarial Attack Benchmarks: PullingAce provides a collection of adversarial attack benchmarks tailored for Hugging Face models. Evaluate model robustness against state-of-the-art attacks.
Incorporating TextAttack Recipes: PullingAce integrates TextAttack's powerful attack recipes, making it easy to experiment with different attack strategies and customize evaluations.

CLI Example

pullingace --attack tomato --model "textattack/albert-base-v2-ag-news" --dataset "ag_news" --num-examples 10
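
The same benchmark can also be driven from a Python script, for example to loop over several models. The following is a minimal sketch, assuming the `pullingace` executable is on your PATH after installation. `build_attack_command` and `run_attack` are hypothetical helper names, not part of PullingAce, and only the flags shown in the CLI example above are used.

```python
import subprocess


def build_attack_command(attack, model, dataset, num_examples):
    """Assemble the argument list for the CLI call shown above (hypothetical helper)."""
    return [
        "pullingace",
        "--attack", attack,
        "--model", model,
        "--dataset", dataset,
        "--num-examples", str(num_examples),
    ]


def run_attack(attack, model, dataset, num_examples):
    """Run the benchmark as a subprocess; assumes `pullingace` is installed."""
    command = build_attack_command(attack, model, dataset, num_examples)
    # check=True raises CalledProcessError if the CLI exits with a non-zero status.
    return subprocess.run(command, check=True)
```

Calling `run_attack("tomato", "textattack/albert-base-v2-ag-news", "ag_news", 10)` should be equivalent to the command above. Passing the arguments as a list avoids shell quoting issues with model names.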

Features for Generative Models

Prompt Injection: PullingAce integrates the prompt injection probes from the Garak library, which test whether crafted inputs can hijack a model's instructions.
Toxicity: PullingAce incorporates Garak's toxicity features, providing additional metrics for evaluating model robustness.
Risk Cards: The Language Model Risk Cards (lmrc) framework catalogs a large set of risks that might be present in LM deployment. Risks can affect a variety of actors in a variety of ways. The set of risks is large, but not all risks apply in all scenarios, so not all lmrc probes will be relevant to every system.
Leak Replay: Integrates probes for evaluating whether a model will replay training data.

Reports

PullingAce creates an HTML report that analyzes the attack.

CLI Example

# Replace with a specific command for prompt injection
pullingace prompt_injection --model_type huggingface --model_name "amazon/MistralLite" --probes HijackHateHumans

Note: the value passed to --probes must match a name defined in PROBE_FAMILIES in subprocessor.
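
One way to catch a misspelled probe before launching a long run is to check it against the accepted names first. The sketch below is illustrative only: `KNOWN_PROBES` is a stand-in containing just the two probe names that appear in this README, not the actual contents of PROBE_FAMILIES, and `build_prompt_injection_command` is a hypothetical helper.

```python
# Stand-in for the real PROBE_FAMILIES names (assumption): only the two
# probe names mentioned in this README are listed here.
KNOWN_PROBES = {"HijackHateHumans", "promptinject"}


def build_prompt_injection_command(model_type, model_name, probe, known_probes=KNOWN_PROBES):
    """Return the CLI argument list, rejecting probes that are not in known_probes."""
    if probe not in known_probes:
        raise ValueError(
            f"Unknown probe {probe!r}; expected one of {sorted(known_probes)}"
        )
    return [
        "pullingace", "prompt_injection",
        "--model_type", model_type,
        "--model_name", model_name,
        "--probes", probe,
    ]
```

In a real script the set of accepted names would be taken from the project itself rather than hard-coded.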

Installation

To get started with PullingAce, follow these steps:

Create a virtual environment
Clone the repository: git clone https://github.com/<username>/pullingace.git
Navigate to the root directory: cd pullingace
Install the package using pip:

pip install .

## Try
pullingace --attack tomato --model "textattack/albert-base-v2-ag-news" --dataset "ag_news" --num-examples 5

This package was created with Cookiecutter and the sourcery-ai/python-best-practices-cookiecutter project template.

Notes for Builds

PullingAce is currently intended to be a Python library.

HAPPY PATH

pipenv install
pip3 install .
pip3 install garak (SEE WHY)

Currently tested in main with pullingace prompt_injection --model_type huggingface --model_name "amazon/MistralLite" --probes promptinject

Uninstall the Previous Version: If you have a previous version of the package installed, uninstall it first to avoid conflicts:

pip uninstall pullingace

Replace pullingace with the name of your package.

Increment the Version Number: If you've made changes that you want to distribute, it's a good practice to increment the version number in your setup.py file.

setup(
    name='your_package_name',
    version='0.2',  # Increment this number
    # ...
)
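
If you bump versions often, the increment can be scripted. This is a minimal sketch, assuming a purely numeric dotted version string such as the '0.2' shown above. `bump_last_component` is a hypothetical helper and does not handle suffixes like 'rc1'.

```python
def bump_last_component(version):
    """Increment the last numeric component of a dotted version string."""
    parts = version.split(".")
    # int() raises ValueError for non-numeric components such as "2rc1".
    parts[-1] = str(int(parts[-1]) + 1)
    return ".".join(parts)
```

For example, `bump_last_component("0.2")` returns `"0.3"`.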

Clear Old Build Directories: Remove old build artifacts to make sure you're starting fresh. Navigate to the folder containing setup.py and run:

rm -rf build dist pulling_ace.egg-info
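
On systems without `rm`, the same cleanup can be done from Python. The following is a sketch of a cross-platform equivalent using the three directory names from the command above. `clean_build_artifacts` is a hypothetical helper, and it only removes directories that actually exist.

```python
import shutil
from pathlib import Path


def clean_build_artifacts(project_root, names=("build", "dist", "pulling_ace.egg-info")):
    """Delete the listed build directories under project_root, if they exist."""
    removed = []
    for name in names:
        target = Path(project_root) / name
        if target.is_dir():
            shutil.rmtree(target)
            removed.append(name)
    # Return the names that were removed so the caller can log them.
    return removed
```

Run it with the folder containing setup.py as `project_root`, for example `clean_build_artifacts(".")`.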

Rebuild the Package: Navigate to the folder where your setup.py file is located and run:

pip install .

Check Installation: You can check if the package is installed correctly by running:

pip list
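
Instead of scanning the `pip list` output by eye, the installed version can be queried programmatically. This sketch uses the standard library's `importlib.metadata` (Python 3.8+). `installed_version` is a hypothetical helper, and the distribution name to pass is whatever `name=` is set to in setup.py; `pullingace` is an assumption based on the uninstall command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution_name):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return version(distribution_name)
    except PackageNotFoundError:
        return None
```

A return value of `None` from `installed_version("pullingace")` would mean the package is not installed in the active environment.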

Setup

# Install dependencies
pipenv install --dev

# Setup pre-commit and pre-push hooks
pipenv run pre-commit install -t pre-commit
pipenv run pre-commit install -t pre-push
