    Repositories list

    • ArtPrompt

      Public
      Official Repo of ACL 2024 Paper `ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs`
      Python
      MIT License
      124600 Updated Nov 2, 2024
    • magpie

      Public
      Python
      MIT License
      55000 Updated Sep 5, 2024
    • Official Repository for ACL 2024 Paper SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding
      Jupyter Notebook
      MIT License
      910101 Updated Jul 19, 2024
    • CleanGen

      Public
      Official Implementation of CLEANGEN: Mitigating Backdoor Attacks for Generation Tasks in Large Language Models
      Python
      1500 Updated Jul 5, 2024
    • ChatBug

      Public
      Official Repo of Paper `ChatBug: A Common Vulnerability of Aligned LLMs Induced by Chat Templates`
      Python
      MIT License
      0600 Updated Jun 24, 2024
    • edc

      Public
      Source Code for "EDC: Effective and Efficient Dialog Comprehension For Dialog State Tracking" (NAACL 2024)
      Python
      0010 Updated Jun 18, 2024
    • ACE

      Public
      Official Repository for ACE: A Model Poisoning Attack on Contribution Evaluation Methods in Federated Learning
      MIT License
      1100 Updated May 21, 2024