PPO_LoRA_finetuning

Context

We provide a lightweight implementation of the PPO finetuning performed in "Grounding Large Language Models in Interactive Environments with Online Reinforcement Learning". We use LoRA, through the PEFT library, so that only a small set of adapter weights needs to be trained.
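As an illustration of this setup, here is a minimal sketch of attaching LoRA adapters with PEFT. The base model and hyperparameter values below are illustrative placeholders, not the exact settings used in this example:

    from peft import LoraConfig, get_peft_model
    from transformers import AutoModelForCausalLM

    base_model = AutoModelForCausalLM.from_pretrained("gpt2")  # illustrative model choice
    lora_config = LoraConfig(
        r=16,             # rank of the low-rank update matrices (illustrative)
        lora_alpha=32,    # scaling factor applied to the adapter output
        lora_dropout=0.05,
        bias="none",
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(base_model, lora_config)
    model.print_trainable_parameters()  # only the adapter weights are trainable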

We leverage Lamorel's custom modules and updaters to add a value head on top of the LLM and finetune the trainable weights using the PPO loss. Finally, using Lamorel's initializer, we add LoRA adapters to the LLM (these are then automatically synchronized by Lamorel if multiple LLM instances are deployed).
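A rough sketch of these three pieces is below. Class and method names follow Lamorel's documented API, but the bodies are simplified: the PPO policy loss is omitted (only a value loss is shown), the LoRA settings are illustrative, and the returns keyword argument is a hypothetical input from the RL script:

    import torch
    from lamorel import BaseModuleFunction, BaseModelInitializer, BaseUpdater
    from peft import LoraConfig, get_peft_model

    class ValueHeadModuleFn(BaseModuleFunction):
        # Custom module: a small MLP mapping the LLM's hidden state to a scalar value
        def initialize(self):
            # self.llm_config and self.device are provided by Lamorel
            hidden_size = self.llm_config.to_dict()[self.llm_config.attribute_map["hidden_size"]]
            self.value_head = torch.nn.Sequential(
                torch.nn.Linear(hidden_size, 1024),
                torch.nn.Tanh(),
                torch.nn.Linear(1024, 1),
            ).to(self.device)

        def forward(self, forward_outputs, minibatch, tokenized_context, **kwargs):
            # Assumes a causal LM whose forward outputs include hidden_states;
            # the hidden state of the last context token feeds the value head
            last_hidden = forward_outputs["hidden_states"][-1][:, len(tokenized_context["input_ids"]) - 1, :]
            return self.value_head(last_hidden).cpu()

    class PeftInitializer(BaseModelInitializer):
        # Initializer: wrap the LLM with LoRA adapters before it is deployed
        def initialize_model(self, model):
            return get_peft_model(model, LoraConfig(r=16, lora_alpha=32))  # illustrative settings

    class PPOUpdater(BaseUpdater):
        # Updater: runs on each LLM process; only a value loss is shown here
        def perform_update(self, contexts, candidates, _current_batch_ids, **kwargs):
            if not hasattr(self, "optimizer"):
                # self._llm_module (set by Lamorel) exposes the trainable weights
                self.optimizer = torch.optim.Adam(self._llm_module.parameters(), lr=1e-5)
            output = self._llm_module(["value"], contexts=contexts, require_grad=True)
            values = torch.stack([_o["value"] for _o in output]).squeeze()
            loss = torch.nn.functional.mse_loss(values, kwargs["returns"])  # hypothetical kwarg from the RL script
            self.optimizer.zero_grad()
            loss.backward()
            self.optimizer.step()
            return {"loss": loss.item()}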

Installation

  1. Install the BabyAI-Text environment
  2. Install required packages: pip install -r requirements.txt

Launch

To launch the example using a single GPU on a local machine:

  1. Spawn both processes (the RL script collecting data and the LLM server):
python -m lamorel_launcher.launch \
       --config-path PROJECT_PATH/examples/PPO_LoRA_finetuning/ \
       --config-name local_gpu_config \
       rl_script_args.path=PROJECT_PATH/examples/PPO_LoRA_finetuning/main.py \
       rl_script_args.output_dir=YOUR_OUTPUT_DIR \
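
For orientation, here is a minimal sketch of how a script like main.py typically wires these pieces together with Lamorel and Hydra. It reuses the class names from the sketch above; the kwargs follow Lamorel's documented Caller API, but refer to the actual main.py for the full training loop:

    import hydra
    from lamorel import Caller, lamorel_init

    lamorel_init()  # must run before anything else to set up the distributed processes

    @hydra.main(config_path="PROJECT_PATH/examples/PPO_LoRA_finetuning", config_name="local_gpu_config")
    def main(config):
        # Register the custom value head, PPO updater and LoRA initializer
        lm_server = Caller(
            config.lamorel_args,
            custom_updater=PPOUpdater(),
            custom_model_initializer=PeftInitializer(),
            custom_module_functions={"value": ValueHeadModuleFn()},
        )
        # ... collect trajectories in BabyAI-Text, then trigger PPO updates, e.g.:
        # lm_server.update(contexts, candidates, returns=...)
        lm_server.close()

    if __name__ == "__main__":
        main()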