Official repository for the EMNLP 2024 paper CItruS: Chunked Instruction-aware State Eviction for Long Sequence Modeling, by Yu Bai*, Xiyuan Zou*, Heyan Huang, Sanxing Chen, Marc-Antoine Rondeau, Yang Gao, and Jackie Chi Kit Cheung
*: Equal contribution
First, set up the environment:
```bash
conda create -y -n citrus_env python=3.9 cudatoolkit=11.3.1 --override-channels -c conda-forge -c nvidia
conda activate citrus_env
pip install transformers==4.34.0 datasets sentencepiece
pip install accelerate bitsandbytes
pip install jieba fuzzywuzzy rouge
git clone https://github.com/ybai-nlp/CItruS
```
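As a quick, optional sanity check (not part of the repository), you can confirm that the pinned `transformers` version is active and that a GPU is visible before running anything:

```python
# Optional environment sanity check (illustrative, not part of CItruS).
import torch
import transformers

print("transformers:", transformers.__version__)    # expected: 4.34.0
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))
```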
Next, follow the usage example below:
```python
from CItruS.src.citrus_methods import generate_with_citrus
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("your_model")  # e.g., "meta-llama/Llama-2-7b-chat-hf"
model = AutoModelForCausalLM.from_pretrained("your_model")  # currently supports Llama 2, Llama 3, and Mistral
device = "your_device"  # e.g., "cuda" or "cpu"

prompt_context = "Enter your context here"
prompt_instruction = "Enter your instruction here"

state_eviction_config = {
    "cache_type": "instruction_aware_single",  # eviction method used during prefilling: "standard", "instruction_aware_single", or "instruction_aware_dual"
    "k": 768,           # number of key-value states kept after eviction
    "chunk_size": 256,  # number of tokens processed per prefilling chunk
}
generation_config = {
    "max_new_tokens": 20,
    "do_sample": False,
    "num_beams": 1,
}

generated_text = generate_with_citrus(model, tokenizer, prompt_context, prompt_instruction,
                                      device, state_eviction_config, generation_config)
print(generated_text)
```
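To see how the three supported eviction variants behave on the same input, one can loop over the `cache_type` options from the snippet above (a sketch; it reuses the `model`, `tokenizer`, prompts, `device`, and `generation_config` already defined):

```python
# Sketch: compare the three supported eviction variants on the same context/instruction.
for cache_type in ["standard", "instruction_aware_single", "instruction_aware_dual"]:
    config = {"cache_type": cache_type, "k": 768, "chunk_size": 256}
    output = generate_with_citrus(model, tokenizer, prompt_context, prompt_instruction,
                                  device, config, generation_config)
    print(f"[{cache_type}]\n{output}\n")
```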
To run CItruS on the LongBench datasets, use the provided script:

```bash
bash run_on_longbench.sh --model_name=meta-llama/Llama-2-7b-chat-hf --dataset_name=qasper --cache_type=instruction_aware_single --chunk_size=256 --k=768
```
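To sweep several LongBench tasks with the same settings, the script can be wrapped in a simple loop (a sketch; the dataset names below follow LongBench's task naming and are only examples):

```bash
# Sketch: run the same CItruS configuration over several LongBench tasks.
for dataset in qasper multifieldqa_en hotpotqa gov_report; do
  bash run_on_longbench.sh \
    --model_name=meta-llama/Llama-2-7b-chat-hf \
    --dataset_name=${dataset} \
    --cache_type=instruction_aware_single \
    --chunk_size=256 \
    --k=768
done
```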
To cite this work:

```bibtex
@misc{2406.12018,
  Author = {Yu Bai and Xiyuan Zou and Heyan Huang and Sanxing Chen and Marc-Antoine Rondeau and Yang Gao and Jackie Chi Kit Cheung},
  Title = {CItruS: Chunked Instruction-aware State Eviction for Long Sequence Modeling},
  Year = {2024},
  Eprint = {arXiv:2406.12018},
}
```