TrojanLM: Trojaning Language Models for Fun and Profit

alps-lab/trojan-lm

Backdoor Attacks against Language Models

Description

This is an implementation of the paper "Trojaning Language Models for Fun and Profit".

Requirements

  • PyTorch
  • Transformers
  • Stanza

Folder Structure

  • toxic_comments: Toxic Comment Classification
  • question_answering: Question Answering
  • text_generation: Text Generation with GPT-2
  • text_infilling: Scripts for the context-aware generative model

Context-Aware Generative Model (Checkpoints)

The checkpoints, in the Transformers checkpoint format, can be found here: https://www.dropbox.com/sh/se991tx7cxm0aec/AAAFAuwr4NCLVDVqV26ZESmqa?dl=0
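
As a quick sanity check after downloading, the sketch below tests whether a local directory looks like a Transformers checkpoint before trying to read it. It assumes the download unpacks to a directory in the standard `save_pretrained` layout (a `config.json` plus a single weight file); this README does not confirm that layout, and sharded checkpoints are not handled. The helper names `looks_like_checkpoint` and `load_config` are hypothetical and are not part of this repository.

```python
from pathlib import Path

# File names written by Transformers' save_pretrained(). Which weight file is
# present depends on the library version used to save the checkpoint.
# Assumption: the checkpoint is not sharded into multiple weight files.
CONFIG_FILE = "config.json"
WEIGHT_FILES = ("pytorch_model.bin", "model.safetensors")


def looks_like_checkpoint(directory):
    """Return True if `directory` holds a config and at least one weight file."""
    root = Path(directory)
    if not (root / CONFIG_FILE).is_file():
        return False
    return any((root / name).is_file() for name in WEIGHT_FILES)


def load_config(directory):
    """Read the checkpoint's configuration with Transformers (imported lazily)."""
    from transformers import AutoConfig  # requires the Transformers package

    if not looks_like_checkpoint(directory):
        raise FileNotFoundError(f"no Transformers checkpoint found in {directory}")
    return AutoConfig.from_pretrained(directory)
```

Calling `load_config` on the unpacked directory returns the stored configuration, which shows the model type to pick when loading the weights themselves.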

Citation

If you use this codebase, please cite our paper:

@inproceedings{Zhang:TrojanLM,
       author = {{Zhang}, Xinyang and {Zhang}, Zheng and {Ji}, Shouling and {Wang}, Ting},
       title = "{Trojaning Language Models for Fun and Profit}",
       booktitle = {Proceedings of the IEEE European Symposium on Security and Privacy (EuroS&P)},
       year = 2021,
}
