- Our paper is now available on arXiv: https://arxiv.org/abs/2403.17411 (3/27/2024)
You can find more details about PCToolkit in our technical report.
Prompt compression is an innovative method for efficiently condensing input prompts while preserving essential information. To offer quick-start services, user-friendly interfaces, and compatibility with common datasets and metrics, we present the Prompt Compression Toolkit (PCToolkit): a unified plug-and-play solution for compressing prompts in Large Language Models (LLMs), featuring cutting-edge prompt compressors, diverse datasets, and metrics for comprehensive performance evaluation. PCToolkit has a modular design that allows new datasets and metrics to be integrated easily through portable, user-friendly interfaces. The technical report outlines the key components and functionalities of PCToolkit.
We conducted evaluations of the compressors within PCToolkit across various natural language tasks, including reconstruction, summarization, mathematical problem-solving, question answering, few-shot learning, synthetic tasks, code completion, boolean expressions, multiple choice questions, and lies recognition.
PCToolkit contains:
- 5 compression methods
- 11 datasets
- 5+ metrics
(i) State-of-the-art and reproducible methods. Encompassing a wide array of mainstream compression techniques, PCToolkit offers a unified interface for various compression methods (compressors). Notably, PCToolkit incorporates five distinct compressors: Selective Context (SC), LLMLingua, LongLLMLingua, SCRL, and Keep it Simple (KiS). A sketch of the unified interface follows this list.
(ii) User-friendly interfaces for new compressors, datasets, and metrics. The interfaces within PCToolkit are designed to be easily customizable, making the toolkit portable, straightforward to adapt, and suitable for a wide range of environments and tasks.
(iii) Modular design. PCToolkit features a modular structure that simplifies switching between methods, datasets, and metrics; it is organized into four distinct modules: Compressor, Dataset, Metric, and Runner.
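As a minimal sketch of that unified interface, the snippet below instantiates each compressor through the same `PromptCompressor` entry point. Only the `'SCCompressor'` type name is confirmed by the examples later in this README; the other four type strings are assumptions for illustration and may differ from the identifiers PCToolkit actually registers.

```python
from pctoolkit.compressors import PromptCompressor

prompt = "Prompt compression condenses an input prompt while preserving its key information."

# 'SCCompressor' is confirmed by the usage examples below; the remaining type
# names are assumed for illustration and may not match PCToolkit's identifiers.
for type_name in ['SCCompressor', 'LLMLinguaCompressor',
                  'LongLLMLinguaCompressor', 'SCRLCompressor', 'KiSCompressor']:
    compressor = PromptCompressor(type=type_name, device='cuda')
    print(type_name, compressor.compressgo(prompt, 0.5))
```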
The following table presents an overview of the supported tasks, compressors, and datasets within PCToolkit. Each component is described in detail in our technical report.
| Tasks | Supported Compressors | Supported Datasets |
|---|---|---|
| Reconstruction | SC, LLMLingua, LongLLMLingua, SCRL, KiS | BBC, ShareGPT, Arxiv, GSM8K |
| Mathematical problems | SC, LLMLingua, LongLLMLingua, SCRL, KiS | GSM8K, BBH |
| Boolean expressions | SC, LLMLingua, LongLLMLingua, SCRL, KiS | BBH |
| Multiple choice | SC, LLMLingua, LongLLMLingua, SCRL, KiS | BBH |
| Lies recognition | SC, LLMLingua, LongLLMLingua, SCRL, KiS | BBH |
| Summarization | SC, LLMLingua, LongLLMLingua, SCRL, KiS | BBC, Arxiv, Gigaword, DUC2004, BNC, Broadcast, Google |
| Summarization | LLMLingua, LongLLMLingua | LongBench |
| Question and answer | SC, LLMLingua, LongLLMLingua, SCRL, KiS | BBH |
| Question and answer | LLMLingua, LongLLMLingua | LongBench |
| Few-shot learning | LLMLingua, LongLLMLingua | LongBench |
| Synthetic tasks | LLMLingua, LongLLMLingua | LongBench |
| Code completion | LLMLingua, LongLLMLingua | LongBench |
```bash
git clone https://github.com/3DAgentWorld/Toolkit-for-Prompt-Compression.git
cd Toolkit-for-Prompt-Compression
```

Then, in the repository folder, install the dependencies:

```bash
pip install -r requirements.txt
```
To keep this repository small, model weights are not shipped with it, so please download the models separately. Most of them are downloaded automatically from the Hugging Face Hub; however, the models for the SCRL method must be downloaded manually. Just follow the guidance inside the `/models` folder.
For prompt compression tasks, see `pctoolkit/compressors.py`, where you can modify the compression methods and their parameters; the file includes an example that is easy to adapt. Alternatively, follow the code below:
```python
from pctoolkit.compressors import PromptCompressor

# Instantiate a Selective Context compressor on the GPU
compressor = PromptCompressor(type='SCCompressor', device='cuda')

test_prompt = "test prompt"
ratio = 0.3  # target compression ratio

# Compress the prompt and print the result
result = compressor.compressgo(test_prompt, ratio)
print(result)
```
For evaluation, see `pctoolkit_demo.py`. Note that if you want to change the metrics, you should modify `pctoolkit/metrics.py`; this matters especially for the LongBench dataset.
```python
from pctoolkit.runners import run
from pctoolkit.datasets import load_dataset
from pctoolkit.metrics import load_metrics
from pctoolkit.compressors import PromptCompressor

# Instantiate a Selective Context compressor on the GPU
compressor = PromptCompressor(type='SCCompressor', device='cuda')

# Load one of the bundled datasets by name
dataset_name = 'arxiv'
dataset = load_dataset(dataset_name)

# Evaluate the compressor end to end at a 0.1 compression ratio
run(compressor=compressor, dataset=dataset, metrics=load_metrics, ratio=0.1)
```
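Because the runner takes the compressor, dataset, and metrics as plain arguments, sweeping settings is a one-line change. Here is a minimal sketch, using only the calls shown above, that evaluates several compression ratios on the same dataset:

```python
from pctoolkit.runners import run
from pctoolkit.datasets import load_dataset
from pctoolkit.metrics import load_metrics
from pctoolkit.compressors import PromptCompressor

compressor = PromptCompressor(type='SCCompressor', device='cuda')
dataset = load_dataset('arxiv')

# Compare quality/compression trade-offs across several target ratios
for ratio in (0.1, 0.3, 0.5):
    run(compressor=compressor, dataset=dataset, metrics=load_metrics, ratio=ratio)
```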
Hint: Please remember to fill in your Hugging Face token and OpenAI API key in `pctoolkit/runners.py`. (You can also change the URLs there if you are using another OpenAI-compatible API.)
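If you add your own metric to `pctoolkit/metrics.py` (as mentioned above), a plain scoring function is one natural shape for it. The sketch below is hypothetical: it assumes a metric maps a prediction and a reference string to a float, so check the interface of the existing metrics in `pctoolkit/metrics.py` before wiring anything in.

```python
# Hypothetical custom metric: fraction of reference tokens that the model's
# output recovers. The (prediction, reference) -> float signature is an
# assumption; verify it against the metrics in pctoolkit/metrics.py.
def token_recall(prediction: str, reference: str) -> float:
    pred_tokens = set(prediction.split())
    ref_tokens = reference.split()
    if not ref_tokens:
        return 0.0
    return sum(t in pred_tokens for t in ref_tokens) / len(ref_tokens)
```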
- Li, Yucheng, et al. "Compressing Context to Enhance Inference Efficiency of Large Language Models." EMNLP 2023.
- Jiang, Huiqiang, et al. "LLMLingua: Compressing Prompts for Accelerated Inference of Large Language Models." EMNLP 2023.
- Jiang, Huiqiang, et al. "LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression." arXiv:2310.06839 (2023).
- Ghalandari, Demian Gholipour, et al. "Efficient Unsupervised Sentence Compression by Fine-tuning Transformers with Reinforcement Learning." arXiv:2205.08221 (2022).
- Laban, Philippe, et al. "Keep It Simple: Unsupervised Simplification of Multi-Paragraph Text." ACL-IJCNLP 2021.
If PCToolkit is used in your research or applications, please cite it using the following BibTeX:
```bibtex
@misc{li2024pctoolkit,
      title={PCToolkit: A Unified Plug-and-Play Prompt Compression Toolkit of Large Language Models},
      author={Jinyi Li and Yihuai Lan and Lei Wang and Hao Wang},
      year={2024},
      eprint={2403.17411},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```