This is the official repo for our in-progress work, Token-Budget-Aware LLM Reasoning.
Reasoning is crucial for LLMs to perform complex tasks, but methods like Chain-of-Thought (CoT) reasoning often lead to significant token overhead and increased costs. We identify substantial token redundancy in the reasoning process of state-of-the-art LLMs and propose a token-budget-aware reasoning framework. This approach dynamically allocates token budgets based on problem complexity to guide the reasoning process. Experiments demonstrate that our method reduces token usage in CoT reasoning with minimal performance trade-offs, striking a practical balance between efficiency and accuracy.
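As a rough illustration of the core idea (not the code in this repo), the sketch below shows one way a token budget could be folded into a CoT prompt. The ask_llm helper and the exact prompt wording are assumptions made for this example.

```python
# Illustrative sketch of token-budget-aware prompting (not the repo's code).
# `ask_llm` stands in for any chat-completion call supplied by the caller.

def budget_aware_prompt(question: str, budget: int) -> str:
    # Ask for step-by-step reasoning, but bound the reasoning length.
    return (
        f"{question}\n"
        f"Let's think step by step and use less than {budget} tokens."
    )

def answer_with_budget(question: str, budget: int, ask_llm) -> str:
    # ask_llm(prompt: str) -> str is assumed to wrap the model API call.
    return ask_llm(budget_aware_prompt(question, budget))
```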
Please see requirements.txt for the dependencies; they can be installed with pip install -r requirements.txt.
Run direct answering (without CoT reasoning) on GSM8K-Zero:

python -u inference.py --data_name GSM8K-Zero --model gpt-4o-mini

Run with CoT reasoning enabled:

python -u inference.py --data_name GSM8K-Zero --model gpt-4o-mini --reasoning

Search for a suitable token budget on GSM8K-Zero (see the sketch below for the intuition):

python -u search_budget.py --do_search --data_name GSM8K-Zero
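For intuition, here is a minimal sketch of what a budget search could look like: repeatedly tighten the budget while the budget-constrained answer stays correct. The ask_llm helper, the prompt wording, and the substring correctness check are assumptions for this example, not the actual logic of search_budget.py.

```python
# Illustrative budget-search sketch (not the actual search_budget.py logic):
# keep halving the budget while the constrained answer remains correct.

def search_budget(question: str, gold_answer: str, start_budget: int, ask_llm) -> int:
    best = budget = start_budget
    while budget >= 1:
        prompt = (
            f"{question}\n"
            f"Let's think step by step and use less than {budget} tokens."
        )
        response = ask_llm(prompt)
        if gold_answer not in response:  # crude correctness check for the sketch
            break                        # stop once accuracy is lost
        best = budget                    # this budget still preserves the answer
        budget //= 2                     # try an even tighter budget
    return best
```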
We have introduced three different budget estimation methods in our paper.
TALE with Zero-shot Estimator:
python -u TALE.py --data_name GSM8K-Zero --model gpt-4o-mini
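As a hedged sketch of the zero-shot estimator idea, the example below first asks the model to estimate a token budget for the question and then reasons under that budget. The ask_llm helper, the prompt wording, and the fallback budget are illustrative assumptions, not the exact prompts used by TALE.py.

```python
# Illustrative sketch of TALE with a zero-shot budget estimator.
import re

def estimate_budget(question: str, ask_llm) -> int:
    # Ask the model itself to guess how many tokens the reasoning needs.
    prompt = (
        "Estimate how many output tokens are needed to reason through and answer "
        "the following question. Reply with a single integer.\n"
        f"Question: {question}"
    )
    reply = ask_llm(prompt)
    match = re.search(r"\d+", reply)
    return int(match.group()) if match else 512  # fall back to a default budget

def tale_zero_shot(question: str, ask_llm) -> str:
    budget = estimate_budget(question, ask_llm)
    return ask_llm(
        f"{question}\nLet's think step by step and use less than {budget} tokens."
    )
```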
TALE with the Regression Estimator and Token-Budget Awareness Internalization via Fine-tuning are on the way!
This project is in progress, and the remaining implementations are coming soon!
@article{han2024token,
title={Token-Budget-Aware LLM Reasoning},
author={Han, Tingxu and Wang, Zhenting and Fang, Chunrong and Zhao, Shiyu and Ma, Shiqing and Chen, Zhenyu},
journal={arXiv preprint arXiv:2412.18547},
year={2024}
}