Code for "Retaining Key Information under High Compression Rates: Query-Guided Compressor for LLMs" (ACL 2024)
```
datasets==2.15.0
flash-attn==2.3.3
jsonlines==4.0.0
torch==2.0.0
torchvision==0.15.0
transformers==4.35.0
```
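The pinned versions above can be installed with pip; a minimal setup sketch (flash-attn generally needs torch and a CUDA toolchain already in place, so it is installed last — the ordering here is an assumption, not a documented requirement of this repo):

```shell
# Install torch first so flash-attn can build against it (CUDA required).
pip install torch==2.0.0 torchvision==0.15.0
pip install datasets==2.15.0 jsonlines==4.0.0 transformers==4.35.0
pip install flash-attn==2.3.3
```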
We walk through an example of how to use our code.

We use LongChat-13B as the target LLM and Llama-2-7B to initialize the compressor parameters. For data, we train and evaluate the compressor on open-source QA datasets (NaturalQuestions, TriviaQA, HotpotQA). All datasets can be downloaded from this site.
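The compressor is trained on QA examples that pair a query with retrieved context documents. A minimal sketch of what one such record might look like in jsonl form — the field names here are illustrative assumptions, not this repo's actual schema:

```python
import json

# Illustrative QA record: a question, retrieved context documents to be
# compressed, and the gold answer. Field names are assumptions.
record = {
    "question": "Who wrote the novel 1984?",
    "documents": [
        "Nineteen Eighty-Four is a dystopian novel by George Orwell.",
        "George Orwell was an English novelist born in 1903.",
    ],
    "answer": "George Orwell",
}

# jsonl stores one JSON object per line; round-trip a single record.
line = json.dumps(record)
loaded = json.loads(line)
print(loaded["answer"])
```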
```shell
# train compressor
bash train.sh

# evaluate compressor
bash infer.sh
```
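Open-domain QA systems are commonly scored with exact match over normalized answers (SQuAD-style normalization). A minimal sketch of that metric — not necessarily the exact scoring used by infer.sh:

```python
import re
import string

def normalize_answer(s: str) -> str:
    """Lowercase, strip punctuation and articles, collapse whitespace."""
    s = s.lower()
    s = "".join(ch for ch in s if ch not in string.punctuation)
    s = re.sub(r"\b(a|an|the)\b", " ", s)
    return " ".join(s.split())

def exact_match(prediction: str, gold_answers: list) -> bool:
    """True if the normalized prediction equals any normalized gold answer."""
    pred = normalize_answer(prediction)
    return any(pred == normalize_answer(g) for g in gold_answers)

print(exact_match("The Eiffel Tower.", ["Eiffel Tower"]))  # True
```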