Training open-source LLMs using RLHF

This project fine-tunes large language models with Reinforcement Learning from Human Feedback (RLHF). Our primary objective is to improve Google's Text-to-Text Transfer Transformer (T5) on the OpenHermesPreferences dataset. We use Proximal Policy Optimization (PPO) as the optimization algorithm, so that the model generates text that aligns more closely with human preferences and values, and PairRM as the reward model.

Ref: https://huggingface.co/blog/rlhf
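
At a high level, training follows the standard RLHF/PPO loop from the reference above: the T5 policy generates responses to prompts, the reward model scores them, and PPO updates the policy while a frozen reference copy constrains the update via a KL penalty. The sketch below outlines that loop; it assumes TRL's classic PPOTrainer generate/step interface, and compute_rewards is a hypothetical placeholder for the actual PairRM scoring in main.py.

    # Minimal RLHF/PPO loop sketch. Assumes trl's classic PPOTrainer
    # (generate/step interface); compute_rewards is a hypothetical stand-in
    # for the PairRM reward model used in this project.
    import torch
    from transformers import AutoTokenizer
    from trl import AutoModelForSeq2SeqLMWithValueHead, PPOConfig, PPOTrainer

    model_name = "google-t5/t5-small"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLMWithValueHead.from_pretrained(model_name)      # policy + value head
    ref_model = AutoModelForSeq2SeqLMWithValueHead.from_pretrained(model_name)  # frozen reference for the KL penalty

    config = PPOConfig(model_name=model_name, learning_rate=1.41e-5, batch_size=1, mini_batch_size=1)
    ppo_trainer = PPOTrainer(config, model, ref_model, tokenizer)

    def compute_rewards(prompts, responses):
        # Hypothetical stand-in for PairRM: returns one scalar reward per (prompt, response) pair.
        return [torch.tensor(float(len(r.split()))) for r in responses]

    prompts = ["Summarize: RLHF aligns language models with human preferences."]
    for _ in range(2):  # a couple of PPO updates, for illustration only
        query_tensors = [tokenizer(p, return_tensors="pt").input_ids.squeeze(0) for p in prompts]
        response_tensors = ppo_trainer.generate(query_tensors, max_new_tokens=32)
        responses = [tokenizer.decode(r, skip_special_tokens=True) for r in response_tensors]
        rewards = compute_rewards(prompts, responses)
        ppo_trainer.step(query_tensors, response_tensors, rewards)  # PPO update with KL-regularized reward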

Setup

  1. Clone this repository:

    git clone https://github.com/gtamer2/rl_final_project.git

  2. Install the dependencies:

    pip install -r requirements.txt

Model Training

Execute the training script with the following command:

python main.py --model_name="google-t5/t5-small" --batch_size=32 --epochs=200 --mode="train"

Parameters

  • batch_size: Batch size for training.
  • epochs: Number of training epochs.
  • model_name: Name or Hugging Face path of the base LLM to fine-tune (e.g., google-t5/t5-small).
  • lr: Learning rate for the optimizer.
  • model_save_path: Path to save the trained model.
  • rewards_save_path: Path to save the recorded rewards.
  • dataset_size: Number of data samples (use -1 to train on the entire dataset).
  • seed: Random seed for reproducibility.
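
For reference, these flags could be wired up with a standard argparse setup along the following lines (the defaults shown are assumptions, not necessarily those used in main.py):

    # Sketch of how main.py's CLI flags could be defined with argparse.
    # Flag names match the list above; default values are assumptions.
    import argparse

    parser = argparse.ArgumentParser(description="RLHF fine-tuning of T5 with PPO")
    parser.add_argument("--model_name", default="google-t5/t5-small")
    parser.add_argument("--batch_size", type=int, default=32)
    parser.add_argument("--epochs", type=int, default=200)
    parser.add_argument("--lr", type=float, default=1.41e-5)
    parser.add_argument("--model_save_path", default="my_ppo_model")
    parser.add_argument("--rewards_save_path", default="reward.npy")
    parser.add_argument("--dataset_size", type=int, default=-1)  # -1 = use the entire dataset
    parser.add_argument("--seed", type=int, default=42)
    parser.add_argument("--mode", choices=["train", "predict", "visualize"], default="train")
    args = parser.parse_args()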

Model Prediction

Execute the prediction script with the following command:

python main.py --model_name="my_ppo_model" --batch_size=32 --mode="predict"

Parameters

  • batch_size: Batch size for prediction.
  • model_name: Name or path of the model used for prediction (e.g., a saved PPO-trained checkpoint such as my_ppo_model).
  • dataset_size: Number of data samples (use -1 to generate predictions for the entire test set).
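
Prediction amounts to loading the saved checkpoint and running generation. A minimal sketch, assuming the checkpoint at the path passed as model_name is saved in the standard Hugging Face format:

    # Inference sketch with the PPO-tuned checkpoint ("my_ppo_model" is the
    # example path from the command above; the actual loading logic lives in main.py).
    from transformers import AutoTokenizer, T5ForConditionalGeneration

    tokenizer = AutoTokenizer.from_pretrained("my_ppo_model")
    model = T5ForConditionalGeneration.from_pretrained("my_ppo_model")

    inputs = tokenizer("Explain RLHF in one sentence.", return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=64)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))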

Visualize Reward Curve

To visualize the reward curve, use the following command:

python main.py --rewards_save_path="reward.npy" --mode="visualize"
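
This mode loads the saved rewards array and plots it. An equivalent standalone snippet, assuming reward.npy holds one average reward value per training epoch:

    # Load the saved rewards and plot the reward curve.
    import numpy as np
    import matplotlib.pyplot as plt

    rewards = np.load("reward.npy")
    plt.plot(rewards)
    plt.xlabel("Epoch")
    plt.ylabel("Average reward")
    plt.title("Average Reward Curve")
    plt.show()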

Results

Model          Avg. Reward   Avg. BLEU Score   Avg. BERT Score
T5 Original    -11.2550      0.0024            0.0273
T5 with RLHF   -4.7752       0.0143            0.0339
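
The BLEU and BERT score columns can be reproduced with the Hugging Face evaluate library along these lines (the exact metric settings are assumptions; see main.py for the evaluation actually used):

    # Sketch of computing BLEU and BERTScore over model outputs.
    import evaluate

    predictions = ["the model generated answer"]   # model outputs
    references = ["the reference answer"]          # ground-truth responses

    bleu = evaluate.load("bleu").compute(predictions=predictions, references=references)
    bertscore = evaluate.load("bertscore").compute(
        predictions=predictions, references=references, lang="en"
    )
    print(bleu["bleu"], sum(bertscore["f1"]) / len(bertscore["f1"]))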

Evaluation Curves

Average Reward Curve:

Team Members
