[🌐 Website] • [📜 Paper] • [🤗 HF Models] • [🐱 GitHub]
Repo for "Interactive Evolution: A Neural-Symbolic Self-Training Framework for Large Language Models"
- [2024/07/12] 🚀 The codebase has been fixed and completed! Try it in branch v0.2!
- [2024/07/09] A series of checkpoints after self-training with ENVISIONS has been released on Hugging Face! They cover the agent, math, and logic domains and include both 7B and 13B versions. Check them out!
- [2024/05/20] 🚀🚀🚀 ENVISIONS is under review!
- [2024/05/01] 🔥🔥🔥 We created a new repo for the code of ENVISIONS!
This work is still in progress. You can also check out our previous work on neural-symbolism, Symbol-LLM, which will appear at the ACL 2024 main conference.
Please refer to requirements.txt to build the environment. The current version of the code supports experiments on 8 GPUs.
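A minimal setup sketch, assuming a fresh conda environment (the environment name and Python version below are assumptions, not pinned by the repo):

```bash
# Hypothetical environment name and Python version -- adjust to your setup.
conda create -n envisions python=3.10 -y
conda activate envisions
pip install -r requirements.txt
```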
To try ENVISIONS, please use the bash script run_self_training.sh or directly use one of the following commands.
For the agentic task MiniWoB, please use:

```bash
python ENVISIONS/self_training_miniwob.py --base_model "llama2chat" --model_size "7B" --task_prefix "miniwob_llama2chat" --vllm_batchsize 1
```
For mathematical tasks, please use:
```bash
python ENVISIONS/self_training.py --base_model "llama2chat" --model_size "7B" --task_prefix "gsm_math_full_llama2chat" --vllm_batchsize 1
```
For logical reasoning tasks, please use:
```bash
python ENVISIONS/self_training_logic.py --base_model "llama2chat" --model_size "7B" --task_prefix "logic_llama2chat" --vllm_batchsize 1
```
*Note: paths to the base LLM must be replaced with your local paths to the corresponding checkpoints.*
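For example, one way to fetch a base checkpoint locally is with `huggingface-cli`; the model ID and target directory below are illustrative assumptions, not a layout required by the repo:

```bash
# Hypothetical model ID and target directory -- substitute the checkpoint you use.
huggingface-cli download meta-llama/Llama-2-7b-chat-hf --local-dir ./checkpoints/llama-2-7b-chat
```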
- The LLM training is based on open-instruct, and the generation steps are accelerated by vLLM (a minimal generation sketch follows this list).
- The environments are modified from Synapse and SeeClick for agentic tasks, PAL for mathematical tasks, and Logic-LM for logical reasoning tasks.
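As a reference for the vLLM-accelerated generation mentioned above, here is a minimal, self-contained sketch of batched sampling with vLLM. It is not the repo's actual sampling loop; the model path, parallelism degree, and sampling parameters are assumptions:

```python
# A minimal vLLM generation sketch -- not ENVISIONS' actual sampling loop.
from vllm import LLM, SamplingParams

# Hypothetical local checkpoint path; tensor_parallel_size=8 matches the 8-GPU setup.
llm = LLM(model="/path/to/llama-2-7b-chat", tensor_parallel_size=8)

# Assumed sampling parameters, for illustration only.
sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=512)

prompts = ["Write a Python program that computes 24 * 17 and prints the result."]
outputs = llm.generate(prompts, sampling_params)
for output in outputs:
    print(output.outputs[0].text)
```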
If you find our work helpful, please kindly cite our paper:
```bibtex
@article{xu2024interactive,
  title={Interactive Evolution: A Neural-Symbolic Self-Training Framework For Large Language Models},
  author={Xu, Fangzhi and Sun, Qiushi and Cheng, Kanzhi and Liu, Jun and Qiao, Yu and Wu, Zhiyong},
  journal={arXiv preprint arXiv:2406.11736},
  year={2024}
}
```