AgentRefine: Enhancing Agent Generalization through Refinement Tuning


Code for the Paper "AgentRefine: Enhancing Agent Generalization through Refinement Tuning".

🔔 If you have any questions or suggestions, please don't hesitate to let us know. You can post an issue on this repository.

💥 News 💥

  • [01/2025] 🔥 Our paper has been accepted by ICLR 2025.

  • [01/2025] 🔥 We will release our model and inference code within one month!

💡 Overview

We introduce AgentRefine, an agent synthesis framework that enables models to learn from observations within trajectories to correct their own errors. AgentRefine significantly outperforms state-of-the-art agent tuning works in terms of generalization capabilities across diverse agent tasks. Our findings establish a relationship between agent generalization and self-improvement, offering a new paradigm for future research.
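The core idea — an agent acts, receives an error observation from the environment, and corrects itself later in the same trajectory — can be illustrated with a small sketch. This is illustrative only; the field names below are our own and do not reflect the released data format:

```python
# A hypothetical refinement trajectory: the first action fails, the
# environment reports the error, and the agent self-corrects.
trajectory = [
    {"role": "user", "content": "Task: put a clean mug on the desk."},
    {"role": "assistant", "content": "Thought: grab the mug. Action: take mug from desk"},
    {"role": "observation", "content": "Error: there is no mug on the desk."},
    {"role": "assistant", "content": "Thought: the mug must be elsewhere. Action: open cabinet"},
]

def count_refinement_turns(traj):
    """Count assistant turns that immediately follow an error observation."""
    return sum(
        1
        for prev, cur in zip(traj, traj[1:])
        if prev["role"] == "observation"
        and prev["content"].startswith("Error")
        and cur["role"] == "assistant"
    )
```

Training on such trajectories teaches the model to use environment feedback to revise its own actions, rather than only imitating gold action sequences.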

📝 Training Data

We provide our training data on HuggingFace.

We will also release our inference code and model soon. Thanks for your patience!

📊 Results

Performance comparison of AgentRefine and other methods across model families and sizes. (In the paper, underlined numbers indicate that the training data was sampled from the same environment as the task, i.e., held-in evaluation.)

| Method | Alfworld (Succ. / Prog.) | BabyAI (Succ. / Prog.) | SciWorld (Succ. / Prog.) | PDDL (Succ. / Prog.) | Jericho (Succ. / Prog.) |
|---|---|---|---|---|---|
| **GPT Series** | | | | | |
| GPT-4o | 66.4 / 79.9 | 48.2 / 64.1 | 40.0 / 76.9 | 61.7 / 69.8 | 10.0 / 34.0 |
| GPT-4o-mini | 37.3 / 65.0 | 36.6 / 51.9 | 23.3 / 49.8 | 25.0 / 49.1 | 10.0 / 28.5 |
| **LLaMA-3-8B Series** | | | | | |
| LLaMA-3-8B-Instruct | 22.4 / 46.1 | 45.5 / 56.5 | 7.8 / 41.1 | 10.0 / 38.4 | 0.0 / 24.3 |
| AgentGen | 29.1 / 47.6 | 20.5 / 35.0 | – / – | 11.7 / 23.0 | – / – |
| AgentGym | 61.9 / 76.9 | 47.3 / 61.4 | 18.9 / 47.5 | 1.7 / 16.6 | 0.0 / 12.9 |
| Agent-FLAN | 67.2 / 79.7 | 25.0 / 35.3 | 1.1 / 10.9 | 8.3 / 25.5 | 0.0 / 10.1 |
| AgentRefine | 44.8 / 63.8 | 37.5 / 50.4 | 14.4 / 42.6 | 16.6 / 37.8 | 10.0 / 32.3 |
| **Mistral Series** | | | | | |
| Mistral-7B-Instruct-v0.3 | 12.4 / 35.9 | 36.6 / 45.8 | 6.7 / 24.7 | 13.3 / 27.8 | 0.0 / 17.3 |
| AgentGym | 76.9 / 86.7 | 40.2 / 56.3 | 15.6 / 48.3 | 1.7 / 7.3 | 0.0 / 13.0 |
| Agent-FLAN | 77.6 / 87.6 | 15.2 / 21.0 | 0.0 / 6.7 | 0.0 / 3.2 | 0.0 / 0.7 |
| AgentRefine | 51.4 / 68.8 | 25.9 / 42.4 | 4.4 / 22.4 | 11.7 / 32.8 | 5.0 / 28.8 |
| **LLaMA-3-70B Series** | | | | | |
| LLaMA-3-70B-Instruct | 67.2 / 75.2 | 48.2 / 61.8 | 42.2 / 75.4 | 55.0 / 79.8 | 25.0 / 46.4 |
| Agent-FLAN | 80.5 / 86.8 | 32.1 / 41.2 | 5.5 / 16.4 | 25.0 / 53.7 | 0.0 / 13.6 |
| AgentRefine | 67.2 / 72.1 | 44.6 / 59.7 | 17.7 / 46.4 | 38.3 / 58.6 | 15.0 / 37.2 |
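To compare methods at a glance, one can average the Success columns per method. A quick sketch for the LLaMA-3-8B family, with the numbers copied from the table above (AgentGen is skipped because two of its cells are missing):

```python
# Mean Success rate across the five benchmarks (LLaMA-3-8B family).
# Values are copied verbatim from the results table.
success = {
    "LLaMA-3-8B-Instruct": [22.4, 45.5, 7.8, 10.0, 0.0],
    "AgentGym":            [61.9, 47.3, 18.9, 1.7, 0.0],
    "Agent-FLAN":          [67.2, 25.0, 1.1, 8.3, 0.0],
    "AgentRefine":         [44.8, 37.5, 14.4, 16.6, 10.0],
}

mean_success = {m: round(sum(v) / len(v), 2) for m, v in success.items()}
```

Note that the baselines' averages are inflated by their held-in tasks; AgentRefine's numbers are spread much more evenly across the five benchmarks.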

📖 Citation

Please kindly cite our paper if it helps your research:

@inproceedings{fu2025agentrefine,
  title={AgentRefine: Enhancing Agent Generalization through Refinement Tuning},
  author={Dayuan Fu and Keqing He and Yejie Wang and Wentao Hong and Zhuoma GongQue and Weihao Zeng and Wei Wang and Jingang Wang and Xunliang Cai and Weiran Xu},
  booktitle={The Thirteenth International Conference on Learning Representations},
  year={2025},
  url={https://openreview.net/forum?id=FDimWzmcWn}
}
