This repository is the official implementation of the paper AtPatch: Debugging Transformers via Hot-Fixing Over-Attention
AtPatch is a hot-fix method that dynamically redistributes attention maps during model inference. For a given input, AtPatch first extracts the attention map from the model's inference process. It then uses a pre-trained detector to identify anomalous columns, replaces them with unified benign attention, and rescales the remaining columns to mitigate the impact of over-attention. Finally, AtPatch returns the redistributed attention map to the model for continued inference. Notably, if the detector reports no anomalous columns, AtPatch returns the original attention map to the model unchanged. Unlike existing techniques, AtPatch redistributes the attention map selectively, which better preserves the model's original functionality. Furthermore, its on-the-fly nature lets it work without modifying model parameters or retraining, making it well suited to already-deployed models.
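The redistribution step above can be sketched as follows. This is a minimal illustration, not the repository's actual implementation: the function name, the array shapes, and the choice of a uniform value (1/seq_len) for the benign attention are all assumptions.

```python
import numpy as np

def redistribute_attention(attn, anomalous_cols, benign_value=None):
    """Replace anomalous columns with a uniform benign value, then rescale
    the remaining columns so each row of the map still sums to 1.

    attn: (seq_len, seq_len) row-stochastic attention map.
    anomalous_cols: column indices reported by the detector; empty -> no-op.
    """
    if len(anomalous_cols) == 0:
        return attn  # detector found nothing: keep the original map
    patched = attn.copy()
    seq_len = attn.shape[-1]
    if benign_value is None:
        benign_value = 1.0 / seq_len  # uniform benign attention (assumption)
    patched[:, anomalous_cols] = benign_value
    # Rescale the untouched columns so every row sums to 1 again.
    keep = np.setdiff1d(np.arange(seq_len), anomalous_cols)
    remaining = 1.0 - benign_value * len(anomalous_cols)
    row_sums = patched[:, keep].sum(axis=1, keepdims=True)
    patched[:, keep] *= remaining / row_sums
    return patched
```

Because the patched map stays row-stochastic, it can be handed straight back to the model for continued inference.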
pip install -r requirements.txt
We provide a demo that runs directly on the MNIST dataset. First, download the MNIST dataset from its official website and use vit_train.py to train a backdoored ViT model. Then run the following command to execute AtPatch:
python AtPatchTool.py --dataset "mnist" --config "./config/mnist.toml" --ttype "content"

Expected output:
Org-ASR: 1.0, Cur-ASR: 0.0, Org-Acc: 0.9842, Cur-Acc: 0.9842
In principle, AtPatch is dataset-agnostic and model-agnostic: you can adapt the corresponding code to run it on any custom dataset and any transformer-based model.
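One way to picture such an adaptation is to wrap the model's attention computation so the detect-and-redistribute step runs on the fly. The sketch below is a hypothetical integration point, and the thresholded detector is a toy stand-in for the pre-trained detector, not the tool's real one.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def detect_over_attention(attn, threshold=0.9):
    """Toy stand-in for the pre-trained detector: flag columns whose mean
    received attention exceeds a threshold (illustrative assumption)."""
    return np.where(attn.mean(axis=0) > threshold)[0]

def patched_attention(q, k, v, detector=detect_over_attention):
    """Scaled dot-product attention with a hot-fix hook: if the detector
    reports anomalous columns, overwrite them with uniform benign
    attention and rescale the rest; otherwise pass the map through."""
    d = q.shape[-1]
    attn = softmax(q @ k.T / np.sqrt(d))
    cols = detector(attn)
    if len(cols) > 0:
        seq_len = attn.shape[-1]
        benign = 1.0 / seq_len  # uniform benign attention (assumption)
        attn[:, cols] = benign
        keep = np.setdiff1d(np.arange(seq_len), cols)
        scale = (1.0 - benign * len(cols)) / attn[:, keep].sum(axis=1, keepdims=True)
        attn[:, keep] *= scale
    return attn @ v
```

In a real transformer you would place this hook inside each attention layer (e.g. via a forward hook in PyTorch), leaving the model's parameters untouched.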

