GemFilter 💎

This is the official PyTorch implementation of the paper "Discovering the Gems in Early Layers: Accelerating Long-Context LLMs with 1000x Input Token Reduction".

We propose an algorithm that uses early layers of an LLM as filters to select and compress input tokens, significantly reducing the context length for subsequent processing.
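To make the selection step concrete, here is a minimal sketch in plain Python. It assumes each input token already has a relevance score obtained from an early layer; the function name `select_top_k_tokens` and the toy scores are illustrative and are not taken from this repository. It keeps the `k` highest-scoring tokens and returns them in their original order, so the shortened prompt still reads left to right.

```python
def select_top_k_tokens(token_ids, scores, k):
    """Keep the k highest-scoring tokens, preserving their original order.

    token_ids: list of input token ids
    scores:    one relevance score per token (assumed to come from an early layer)
    k:         number of tokens to keep
    """
    if len(token_ids) != len(scores):
        raise ValueError("token_ids and scores must have the same length")
    k = min(k, len(token_ids))
    # Rank positions by score, highest first; ties go to the earlier position.
    ranked = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    # Re-sort the surviving positions so the compressed input keeps its order.
    kept = sorted(ranked[:k])
    return [token_ids[i] for i in kept], kept


if __name__ == "__main__":
    # Toy values, invented for illustration only.
    token_ids = [101, 7, 42, 9, 300, 15]
    scores = [0.05, 0.30, 0.02, 0.40, 0.03, 0.20]
    print(select_top_k_tokens(token_ids, scores, k=3))
```

The compressed token list would then be passed to the full model in place of the original long input. How the scores are actually computed and which layer is used are decided by the repository code, not by this sketch.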

[arXiv paper]

Requirements

The code depends on Hugging Face transformers version 4.43.3.

transformers==4.43.3
flash-attn==2.6.3
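
Before running anything, it can help to confirm that the installed packages match the pins above. The following is a small sketch using only the standard library; the helper name `check_pins` is hypothetical and not part of this repository.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the requirements listed above.
PINNED = {"transformers": "4.43.3", "flash-attn": "2.6.3"}


def check_pins(pins=PINNED):
    """Return {package: (expected, installed_or_None, matches)}."""
    report = {}
    for name, expected in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (expected, installed, installed == expected)
    return report


if __name__ == "__main__":
    for name, (expected, installed, ok) in check_pins().items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name}: expected {expected}, found {installed} [{status}]")
```

The comparison is an exact string match, so a build with a local version suffix would be reported as a mismatch even if it is compatible.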

Installation

Check which PyTorch version is correct for your environment (for example, your CUDA version) before installing.

git clone https://github.com/SalesforceAIResearch/GemFilter.git
cd GemFilter
conda create --name gemfilter python=3.12
conda activate gemfilter
pip install torch torchvision torchaudio 
pip install -r requirements.txt
python setup.py develop

Quick Start

Use GemFilter Method

python needle_eval.py \
 --model hf_model_id \
 --modified gemfilter \
 --topk 1024 \
 --ctx_len 32000
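
To run the same evaluation at several context lengths, the command can be assembled programmatically. The sketch below only builds and prints the command lines, using the flags shown above; the helper name `build_needle_command` and the context lengths other than 32000 are illustrative assumptions, and `hf_model_id` remains a placeholder for a real model id.

```python
import shlex


def build_needle_command(model, topk, ctx_len, modified="gemfilter"):
    """Build the argument list for needle_eval.py using the flags shown above."""
    return [
        "python", "needle_eval.py",
        "--model", model,
        "--modified", modified,
        "--topk", str(topk),
        "--ctx_len", str(ctx_len),
    ]


if __name__ == "__main__":
    # Context lengths here are example values, not recommended settings.
    for ctx_len in (8000, 16000, 32000):
        cmd = build_needle_command("hf_model_id", topk=1024, ctx_len=ctx_len)
        print(shlex.join(cmd))
```

Each printed line can be pasted into a shell, or the list can be passed to `subprocess.run` from the repository root.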

Customize Your Models

GemFilter can be easily integrated with any transformer model. Follow the comments marked with [GemFilter] to construct your own models.

The detailed algorithm of GemFilter is in gem_filter_utils.py and my_generation.py.
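
For readers adapting their own model, the sketch below shows one plausible way to turn the attention of a filter layer into per-token scores. It is not copied from the files named above. It assumes the score of a position is the softmax attention weight that the last query token assigns to it, summed over heads; the function names and toy vectors are hypothetical, and the repository code remains the reference for the exact scoring and head aggregation.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def token_scores(last_query_per_head, keys_per_head):
    """Sum, over heads, the attention the last query pays to each position.

    last_query_per_head: list of H query vectors, each of length d
    keys_per_head:       list of H lists, each holding n key vectors of length d
    """
    n = len(keys_per_head[0])
    scores = [0.0] * n
    for q, keys in zip(last_query_per_head, keys_per_head):
        d = len(q)
        # Scaled dot product between the last query and every key of this head.
        logits = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        for pos, weight in enumerate(softmax(logits)):
            scores[pos] += weight
    return scores


if __name__ == "__main__":
    # Two heads, three positions, dimension 2; values invented for illustration.
    queries = [[1.0, 0.0], [0.0, 1.0]]
    keys = [
        [[0.0, 0.0], [2.0, 0.0], [0.0, 0.0]],  # head 0 attends mostly to position 1
        [[0.0, 0.0], [0.0, 0.0], [0.0, 2.0]],  # head 1 attends mostly to position 2
    ]
    print(token_scores(queries, keys))
```

The positions with the largest scores would be the ones kept when the input is compressed.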

Partial Results

Needle-in-a-Haystack

Evaluation on the Needle-in-a-Haystack benchmark. See more details here.

LongBench

Evaluation on the LongBench benchmark. See more details here.

Running Time and GPU Memory

Citation

If you find this project helpful, please consider citing our paper 😊

@article{smn+24,
 title={Discovering the Gems in Early Layers: Accelerating Long-Context LLMs with 1000x Input Token Reduction},
 author={Shi, Zhenmei and Ming, Yifei and Nguyen, Xuan-Phi and Liang, Yingyu and Joty, Shafiq},
 journal={arXiv preprint arXiv:2409.17422},
 year={2024}
}

