Eval instructions #10

Open · wants to merge 14 commits into main
20 changes: 5 additions & 15 deletions README.md
@@ -1,7 +1,7 @@

# **H**um**a**n-Centered **Lo**ss Functions (HALOs) :innocent:
# Human-Aware Loss Functions (HALOs) :innocent:

This repo allows you to design new **HumAn-centered LOss functions (HALOs)** for aligning LLMs with offline human feedback at scale [(read more in our technical report)](assets/report.pdf).
This repo allows you to design new **Human-Aware Loss Functions (HALOs)** for aligning LLMs with offline human feedback at scale (read more in our [technical report](assets/report.pdf) or our [full paper](assets/full_paper.pdf)).
It was used to create Archangel, the largest-ever suite of human-feedback-aligned LLMs, and has been tested at scales from 1B to 30B.

This repo draws from the excellently written [DPO repo](https://github.com/eric-mitchell/direct-preference-optimization) and has preserved many design choices from the original.
@@ -45,7 +45,7 @@ What should we do?
5. Write a trainer in `trainers.py`. This should subclass either `UnpairedPreferenceTrainer` or `PairedPreferenceTrainer` depending on whether it uses pairs of preferences or not.
If you need highly custom behavior that is not in either, then you can subclass `BasicTrainer` directly.

We can implement a simple version of KTO as follows (not that this is different from the proper version of KTO in `KTOTrainer`, which does not assume the existence of both chosen and rejected examples in each batch).
We can implement a simple version of KTO as follows (note that this is different from the proper version of KTO in `KTOTrainer`, which does not assume the existence of both chosen and rejected examples in each batch).

To make SimpleKTOTrainer, we just subclass `trainers.UnpairedPreferenceTrainer` as `trainers.SimpleKTOTrainer` and overwrite the loss function definition. KTO has one hyperparameter, beta, which we can access via `self.config.loss.beta`:

@@ -65,17 +65,7 @@ What should we do?
If generation y ~ p_rejected, where x' are the examples with chosen generations, we have the 'rejected' loss:
L(x, y) := 1 - sigmoid(beta * KL(p_policy(y_chosen|x') || p_reference(y_chosen|x')) - [log p_policy(y|x) - log p_reference(y|x)])
"""
chosen_KL = (policy_chosen_logps - reference_chosen_logps).mean().clamp(min=0)
rejected_KL = (policy_rejected_logps - reference_rejected_logps).mean().clamp(min=0)

chosen_logratios = (policy_chosen_logps - reference_chosen_logps)
rejected_logratios = (policy_rejected_logps - reference_rejected_logps)

losses = torch.cat((1 - F.sigmoid(self.config.loss.beta * (chosen_logratios - rejected_KL)), 1 - F.sigmoid(self.config.loss.beta * (chosen_KL - rejected_logratios))), 0)

chosen_rewards = self.config.loss.beta * (policy_chosen_logps - reference_chosen_logps).detach()
rejected_rewards = self.config.loss.beta * (policy_rejected_logps - reference_rejected_logps).detach()

# your implementation goes here
return losses, chosen_rewards, rejected_rewards
```
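
For reference, here is one minimal sketch of that loss, filling in the `# your implementation goes here` stub with the formulas from the docstring above. The exact signature of `UnpairedPreferenceTrainer.loss` is not shown in this diff, so the argument list below simply mirrors the variable names used above, and the sketch assumes `torch` and `torch.nn.functional as F` are already imported in `trainers.py`. Treat it as a starting point, not the canonical `KTOTrainer` implementation:

```
class SimpleKTOTrainer(UnpairedPreferenceTrainer):
    def loss(self,
             policy_chosen_logps,
             policy_rejected_logps,
             reference_chosen_logps,
             reference_rejected_logps):
        # Batch-level estimates of the KL terms; KL is non-negative, so clamp at zero.
        chosen_KL = (policy_chosen_logps - reference_chosen_logps).mean().clamp(min=0)
        rejected_KL = (policy_rejected_logps - reference_rejected_logps).mean().clamp(min=0)

        # Per-example log-ratios of policy vs. reference.
        chosen_logratios = policy_chosen_logps - reference_chosen_logps
        rejected_logratios = policy_rejected_logps - reference_rejected_logps

        # Chosen examples are scored against the rejected KL estimate, and vice versa.
        losses = torch.cat((
            1 - F.sigmoid(self.config.loss.beta * (chosen_logratios - rejected_KL)),
            1 - F.sigmoid(self.config.loss.beta * (chosen_KL - rejected_logratios)),
        ), 0)

        # Implied rewards are the scaled log-ratios, detached for logging only.
        chosen_rewards = self.config.loss.beta * chosen_logratios.detach()
        rejected_rewards = self.config.loss.beta * rejected_logratios.detach()

        return losses, chosen_rewards, rejected_rewards
```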

@@ -145,7 +135,7 @@ If you find this repo or the technical paper useful in your research, please fee
```
@techreport{ethayarajh2023halos,
author = {Ethayarajh, Kawin and Xu, Winnie and Jurafsky, Dan and Kiela, Douwe},
title = {Human-Centered Loss Functions (HALOs)},
title = {Human-Aware Loss Functions (HALOs)},
institution = {Contextual AI},
note = {https://github.com/ContextualAI/HALOs/blob/main/assets/report.pdf},
year = {2023},
Binary file added assets/full_paper.pdf
Binary file not shown.
Binary file modified assets/report.pdf
Binary file not shown.
6 changes: 6 additions & 0 deletions config/model/mistral7b_sft_beta.yaml
@@ -0,0 +1,6 @@
defaults:
- base_model

name_or_path: HuggingFaceH4/mistral-7b-sft-beta
block_name: MistralDecoderLayer
use_flash_attention: true
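
The `defaults: - base_model` entry suggests the same Hydra-style config composition as the repo's other model configs. Purely as a hedged example (the `train.py` entry point, the `loss=kto` override, and the `exp_name` value are assumptions, not taken from this PR), the new config could then be selected like so:

```
# hypothetical launch command; only model=mistral7b_sft_beta is defined by this PR
python train.py model=mistral7b_sft_beta loss=kto exp_name=kto_mistral7b_sft_beta
```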
72 changes: 72 additions & 0 deletions scripts/eval.sh
@@ -0,0 +1,72 @@
#!/bin/bash
CKPT=$1
# OUT is the basename of CKPT (the last path component); use an absolute path so it still resolves after the cd's below
OUT=$(pwd)/results/${CKPT##*/}
mkdir -p $OUT

DATADIR=evaldata/eval
FORMAT=eval.templates.create_prompt_with_halo_chat_format

pwd=$(pwd)
cd open-instruct

python -m eval.gsm.run_eval \
--data_dir $DATADIR/gsm/ \
--max_num_examples 200 \
--save_dir $OUT \
--model $CKPT \
--tokenizer_name_or_path $CKPT \
--n_shot 8 \
--use_chat_format \
--use_vllm \
--chat_formatting_function $FORMAT

python -m eval.mmlu.run_eval \
--ntrain 0 \
--data_dir $DATADIR/mmlu/ \
--save_dir $OUT \
--model_name_or_path $CKPT \
--tokenizer_name_or_path $CKPT \
--eval_batch_size 4 \
--use_chat_format \
--chat_formatting_function $FORMAT

python -m eval.bbh.run_eval \
--data_dir $DATADIR/bbh \
--save_dir $OUT \
--model $CKPT \
--tokenizer_name_or_path $CKPT \
--max_num_examples_per_task 40 \
--use_chat_format \
--use_vllm \
--chat_formatting_function $FORMAT

python -m eval.tydiqa.run_eval \
--data_dir $DATADIR/tydiqa \
--n_shot 1 \
--max_num_examples_per_lang 100 \
--max_context_length 512 \
--save_dir $OUT \
--model $CKPT \
--tokenizer_name_or_path $CKPT \
--use_chat_format \
--use_vllm \
--chat_formatting_function $FORMAT

cd ../bigcode-evaluation-harness

accelerate launch --config_file /home/niklas/sgpt2/scripts/configs/config_1gpu_vanilla.yml main.py \
--model $CKPT \
--tasks humanevalsynthesize-python \
--do_sample True \
--temperature 0.2 \
--n_samples 20 \
--batch_size 5 \
--allow_code_execution \
--save_generations \
--trust_remote_code \
--prompt halo \
--save_generations_path $OUT/generations_humanevalsynthesizepython.json \
--metric_output_path $OUT/evaluation_humanevalsynthesizepython.json \
--max_length_generation 2048 \
--precision bf16
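
A usage sketch (the checkpoint path below is a placeholder): the script takes a single argument, the already-reformatted checkpoint directory, and writes all results under `results/<checkpoint basename>` in the directory it is run from. Per the `cd` calls above, it assumes `open-instruct` and `bigcode-evaluation-harness` are checked out inside that same directory:

```
# results will be written to results/kto_mistral7b
bash scripts/eval.sh /data/models/kto_mistral7b
```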
19 changes: 19 additions & 0 deletions scripts/reformat_statedict.py
@@ -0,0 +1,19 @@
import os
import sys
import torch

path = sys.argv[1]

sd_path = os.path.join(path, "policy.pt")

sd = torch.load(sd_path)
# A raw training checkpoint stores the weights under the "state" key; if it's missing, assume the file was already reformatted
if "state" not in sd:
print('SD seems already reformatted: ', sd.keys())
sys.exit(0)
torch.save(sd["state"], sd_path)

# Rename to the standard Hugging Face filename, then copy in the tokenizer/config files
os.system(f"mv {sd_path} {os.path.join(path, 'pytorch_model.bin')}")
os.system(f"cp -r /data/niklas/kto_mistralsft/*tok* {path}/")
os.system(f"cp -r /data/niklas/kto_mistralsft/*json* {path}/")