🃏 Model card for TRL #2123

qgallouedec · 2024-09-25T16:45:58Z

What does this PR do?

Adds a TRL-specific model card template, so that models trained with TRL get their own model card.

Demo

from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

# Load a small instruct model and its tokenizer
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-0.5B-Instruct")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-0.5B-Instruct")

# Train on a 50-example subset to keep the demo short
dataset_name = "trl-lib/tldr-preference"
train_dataset = load_dataset(dataset_name, split="train").select(range(50))

args = DPOConfig(
    output_dir="dpo-qwen2",
    logging_steps=10,
    push_to_hub=True,
    save_strategy="epoch",
    num_train_epochs=1,
)
trainer = DPOTrainer(model=model, args=args, tokenizer=tokenizer, train_dataset=train_dataset)
trainer.train()

# Pass the dataset name so the generated model card can link to it
if args.push_to_hub:
    trainer.push_to_hub(dataset_name=dataset_name)

Result: https://huggingface.co/qgallouedec/dpo-qwen2

It adds

Link to the paper

Link to the dataset

TRL's own model card

Other

Citations
Links to wandb run
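
To illustrate how the items listed above (dataset link, citations, wandb run link) could end up in a generated card, here is a minimal sketch. It is not the template shipped in this PR: `CARD_TEMPLATE`, `render_model_card`, and all of its parameters are hypothetical names chosen for illustration, and the layout is an assumption.

```python
from string import Template

# Hypothetical, heavily simplified template; NOT the contents of
# trl/templates/model_card.md.
CARD_TEMPLATE = Template(
    "# Model Card for $model_name\n"
    "\n"
    "Fine-tuned from $base_model_link on $dataset_link using TRL ($trainer_name).\n"
    "$wandb_section"
    "$citation_section"
)


def render_model_card(model_name, base_model, dataset_name, trainer_name,
                      wandb_url=None, citation=None):
    """Fill the illustrative template; optional sections are omitted when unset."""
    wandb_section = f"\nTraining run: {wandb_url}\n" if wandb_url else ""
    citation_section = f"\n## Citations\n\n{citation}\n" if citation else ""
    return CARD_TEMPLATE.substitute(
        model_name=model_name,
        base_model_link=f"https://huggingface.co/{base_model}",
        dataset_link=f"https://huggingface.co/datasets/{dataset_name}",
        trainer_name=trainer_name,
        wandb_section=wandb_section,
        citation_section=citation_section,
    )


# Values taken from the demo in this PR description
card = render_model_card(
    model_name="dpo-qwen2",
    base_model="Qwen/Qwen2-0.5B-Instruct",
    dataset_name="trl-lib/tldr-preference",
    trainer_name="DPO",
)
print(card)
```

Keeping the wandb and citation sections optional means a run without logging or without an associated paper still produces a valid card.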

TODO

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline,
Pull Request section?
Was this discussed/approved via a GitHub issue? Please add a link
to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the
documentation guidelines.
Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

HuggingFaceDocBuilderDev · 2024-09-25T16:51:34Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

lewtun

Amazing work on the model cards @qgallouedec - now they're really packed with useful information 🔥 !!

LGTM with a tweak to the example inference code

examples/scripts/alignprop.py

lewtun · 2024-09-27T07:05:18Z

examples/scripts/rloo/rloo.py

@@ -133,6 +133,6 @@ def tokenize(element):
    # Save and push to hub
    trainer.save_model(training_args.output_dir)
    if training_args.push_to_hub:
-        trainer.push_to_hub()
+        trainer.push_to_hub(dataset_name="trl-internal-testing/descriptiveness-sentiment-trl-style")


Note to self: we should move these datasets that aren't strictly used for tests to trl-lib

tests/test_dpo_trainer.py

trl/templates/model_card.md

qgallouedec · 2024-09-27T09:56:29Z

Note that the code demo in the model card will be wrong for diffusion models and VLMs, but we can probably keep it like that for now.

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

valayDave · 2024-10-16T22:16:30Z

When will this commit be released? Will it be part of v0.12?

qgallouedec added 4 commits September 25, 2024 16:34

template and util

5a5f2c5

test for online dpo

5269a21

template in package_data

01c2e77

template in manifest

ca73c40

qgallouedec mentioned this pull request Sep 25, 2024

Standardize pushing to Hub in examples #2126

Merged

5 tasks

qgallouedec and others added 24 commits September 25, 2024 21:44

standardize push_to_hub

643b9bc

wandb badge and quick start

41fe4bb

Merge branch 'main' into model_card

db50cfb

Merge branch 'main' into model_card

198f4cb

bco

9f2ed4c

xpo

4afcd90

simplify create_model_card

5cf9320

cpo

61c912a

kto

6d33273

dpo

9e67eb5

gkd

5c32249

Merge branch 'main' into model_card

86480e7

orpo

635ec60

style

475abda

nash-md

fe399f1

alignprop

4a40455

bco citation

4d72f0f

citation template

a0d81eb

cpo citation

8d37399

ddpo

3de7b09

fix alignprop

6bf11cd

dpo

0ec6522

gkd citation

f0d4c83

kto

22dec89

qgallouedec and others added 4 commits September 26, 2024 14:40

Add dataset name to push_to_hub() call

53d9962

Update trainer.push_to_hub() dataset names

9a0aec0

Merge branch 'main' into model_card

d83f32e

script args

164037c

qgallouedec requested review from kashif and lewtun September 26, 2024 14:59

qgallouedec marked this pull request as ready for review September 26, 2024 14:59

qgallouedec requested a review from edbeeching September 26, 2024 15:01

qgallouedec and others added 8 commits September 26, 2024 15:20

test

9de0288

better doc

5c84760

fix tag test

f49386a

fix test tag

d4de482

Add tags parameter to create_model_card method

254fcbc

doc

09d16e3

Merge branch 'main' into model_card

40ff943

script args

6e71659

lewtun approved these changes Sep 27, 2024

Merge branch 'main' into model_card

5bec9c2

qgallouedec and others added 3 commits September 27, 2024 12:09

Update trl/templates/model_card.md

d9973c0

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

unittest's assertIn instead of assert

65af404

Update trl/templates/model_card.md

9a08ebd

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

kashif approved these changes Sep 27, 2024

Rename model_card to lm_model_card

5080ff5

qgallouedec merged commit c00722c into main Sep 27, 2024
3 checks passed

qgallouedec deleted the model_card branch September 27, 2024 13:23

lewtun mentioned this pull request Sep 30, 2024

🐾 Process-supervised RM Trainer #2127

Merged

5 tasks

This was referenced Oct 4, 2024

🃏 Model card: "unsloth" tag #2173

Merged

Trainer.push_to_hub() with PEFT doesn't work when the base model is loaded from local disk huggingface/transformers#33922

Closed
