dreadnode/research

Dreadnode Research

This is a general repository for research, projects, and reference code produced at Dreadnode.

Mistral - Adversarial Suffix

Implementation of "Universal and Transferable Adversarial Attacks on Aligned Language Models" for Mistral 7B.

Mistral - BEAST Beam Attack

Implementation of "Fast Adversarial Attacks on Language Models In One GPU Minute" for Mistral 7B. At the time of release, the authors had not posted the reference code from the paper, so this implementation is likely incorrect.

Llama PGD

Implementation of "Attacking Large Language Models with Projected Gradient Descent" for Llama model variants with LitGPT. At the time of release, the authors had not posted any reference code, so treat this implementation with caution.

Needle Triage/Fix

Research in partnership with OpenSSF for the AIxCC Event.
