QUITO: Accelerating Long-Context Reasoning through Query-Guided Context Compression

📃 Paper

🔍 Overview

This is the official repository for the paper. We release QUITO, a powerful context compressor that leverages the attention of the query over the context to filter out irrelevant information.

Figure: overview of the QUITO framework.
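
The core idea can be sketched in a few lines with Hugging Face transformers. This is only a rough illustration of query-guided, attention-based filtering, not the repository's actual implementation; the model choice, the averaging over layers and heads, and the simple top-k token selection are assumptions made for the example.

# Rough sketch of query-guided filtering (illustrative only).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2-0.5B-Instruct"  # same model family as the Quick Start
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, attn_implementation="eager")
model.eval()

context = ("The Eiffel Tower, built in 1889, is located in Paris, France. "
           "It was designed by Gustave Eiffel's engineering company.")
query = "Where is the Eiffel Tower located?"

# Feed "context + query" so that query tokens can attend back to context tokens.
ctx_ids = tokenizer(context, return_tensors="pt").input_ids
qry_ids = tokenizer(query, return_tensors="pt").input_ids
input_ids = torch.cat([ctx_ids, qry_ids], dim=1)

with torch.no_grad():
    out = model(input_ids, output_attentions=True)

# Average attention over layers and heads, then score each context token by how
# much the query tokens attend to it.
attn = torch.stack(out.attentions).mean(dim=(0, 2))   # (batch, seq, seq)
n_ctx = ctx_ids.shape[1]
scores = attn[0, n_ctx:, :n_ctx].mean(dim=0)          # one score per context token

# Keep the top 50% highest-scoring context tokens (ratio=0.5), preserving order.
keep = scores.topk(max(1, int(0.5 * n_ctx))).indices.sort().values
compressed = tokenizer.decode(ctx_ids[0, keep], skip_special_tokens=True)
print(compressed)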

🎯 Quick Start

1. Installation

git clone https://github.com/Wenshansilvia/attention_compressor
cd attention_compressor/
pip install -r requirements.txt

2. Usage

from quito.compressor import Compressor

compressor = Compressor('Qwen/Qwen2-0.5B-Instruct')

# doc is the context to compress, query is the question that guides the
# compression, and ratio is the compression ratio (here 0.5).
# The empty strings below are placeholders.

# Phrase-level filtering
compressed_context = compressor.compress(doc="", query="", ratio=0.5)

# Or sentence-level filtering
compressed_context = compressor.compress_sentence(doc="", query="", ratio=0.5)

# Or dynamic sentence-level filtering
compressed_context = compressor.compress_sentence_token(doc="", query="", ratio=0.5)
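
For reference, here is a slightly more concrete call using the same API as above; the document and question are illustrative placeholders of ours, not examples from the repository.

from quito.compressor import Compressor

compressor = Compressor('Qwen/Qwen2-0.5B-Instruct')

doc = ("Alan Turing was a British mathematician and computer scientist. "
       "He formalized the concepts of algorithm and computation with the "
       "Turing machine and worked at Bletchley Park during World War II.")
query = "What did Alan Turing formalize?"

# Sentence-level filtering with ratio=0.5, as in the examples above.
compressed_context = compressor.compress_sentence(doc=doc, query=query, ratio=0.5)

# The compressed context can then be dropped into a downstream prompt.
prompt = f"Context: {compressed_context}\n\nQuestion: {query}\nAnswer:"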

📌 Citation

If you find the repository or paper helpful, please cite our work:

@misc{wang2024quitoacceleratinglongcontextreasoning,
      title={QUITO: Accelerating Long-Context Reasoning through Query-Guided Context Compression},
      author={Wenshan Wang and Yihang Wang and Yixing Fan and Huaming Liao and Jiafeng Guo},
      year={2024},
      eprint={2408.00274},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2408.00274}
}
