This repository contains the code to reproduce the experiments in the paper ShadowLLM: Predictor-based Contextual Sparsity for Large Language Models. To reproduce the accuracy figures, see the llm-interpret directory. To reproduce the performance figures, see the Dejavu directory.
If you find our work useful, please consider citing it as follows:
@misc{akhauri2024shadowllmpredictorbasedcontextualsparsity,
      title={ShadowLLM: Predictor-based Contextual Sparsity for Large Language Models},
      author={Yash Akhauri and Ahmed F AbouElhamayed and Jordan Dotzel and Zhiru Zhang and Alexander M Rush and Safeen Huda and Mohamed S Abdelfattah},
      year={2024},
      eprint={2406.16635},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2406.16635},
}