FlexRAG is a lightweight model designed to reduce the running cost of RAG while improving its generation quality. It compresses retrieved contexts into compact embeddings, which are optimized to enhance downstream RAG performance. A key feature of FlexRAG is its flexibility: it supports diverse compression ratios and selectively preserves important contexts.
The evaluation dataset for FlexRAG is released here. Please download it and unzip it into the data folder.
You can install the necessary dependencies with the following command. Python 3.10+ is recommended.
pip install -r requirements.txt
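Since Python 3.10+ is recommended, it can help to check the interpreter before installing. The following is a minimal sketch; `python_ok` is a hypothetical helper, not part of the repository, and it assumes the interpreter is available as `python3` unless another name is passed.

```shell
# Hypothetical pre-install check (not part of FlexRAG):
# succeeds only if the given interpreter reports Python 3.10 or newer.
python_ok() {
  "${1:-python3}" -c 'import sys; sys.exit(0 if sys.version_info >= (3, 10) else 1)'
}

# Example usage:
#   python_ok && pip install -r requirements.txt
```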
All experiment scripts are included in the experiments directory. For example, to evaluate on the long-sequence multi-doc QA dataset:
Vanilla RAG:
cd FlexRAG
bash experiments/eval/eval_longbench_base.sh
FlexRAG w/o Selective Compression:
bash experiments/eval/eval_longbench_flexrag_wo_sc.sh
FlexRAG w/ Selective Compression:
bash experiments/eval/eval_longbench_flexrag_embedding.sh
The final evaluation results will be stored in the data/longbench directory.
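The three evaluations above can also be chained in a single pass. The following is a minimal sketch, assuming it is run from the FlexRAG directory; `run_evals` is a hypothetical helper, not part of the repository, and only the script paths come from the commands above.

```shell
# Hypothetical helper (not part of FlexRAG): run the given scripts in order,
# stopping at the first one that is missing or exits with an error.
run_evals() {
  local script
  for script in "$@"; do
    if [ ! -f "$script" ]; then
      echo "missing script: $script" >&2
      return 1
    fi
    echo "==> running $script"
    bash "$script" || return 1
  done
}

# Example usage, from the FlexRAG directory:
#   run_evals \
#     experiments/eval/eval_longbench_base.sh \
#     experiments/eval/eval_longbench_flexrag_wo_sc.sh \
#     experiments/eval/eval_longbench_flexrag_embedding.sh
```

Stopping at the first failure keeps a broken baseline run from being followed by comparisons that would be hard to interpret.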
If you find this repository useful, please consider giving it a star ⭐ and citing our paper:
@article{liu2024lighter,
title={Lighter And Better: Towards Flexible Context Adaptation For Retrieval Augmented Generation},
author={Liu, Zheng and Wu, Chenyuan and Shao, Ninglu and Xiao, Shitao and Li, Chaozhuo and Lian, Defu},
journal={arXiv preprint arXiv:2409.15699},
year={2024}
}