
Add toxicity example #162

Merged
31 commits merged into huggingface:main on Feb 28, 2023

Conversation

@younesbelkada (Contributor) commented on Feb 17, 2023

What does this PR do?

This PR adds the toxicity example and replaces #142

TODO:

  • update docs with tips

cc @lvwerra

@HuggingFaceDocBuilderDev commented on Feb 17, 2023

The documentation is not available anymore as the PR was closed or merged.

@lvwerra (Member) left a comment

Hi @younesbelkada, I did a first pass over the text and it looks good. Left a few comments and suggestions.

trl/trainer/ppo_config.py (resolved, outdated)
docs/source/detoxifying_a_lm.mdx (resolved, outdated)
docs/source/detoxifying_a_lm.mdx (resolved, outdated)
```python
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-j-6B", torch_dtype=torch.bfloat16)
```

- Use shared layers: Since the PPO algorithm requires both the active and reference models to be on the same device, we decided to use shared layers to reduce the memory footprint of the model. This can be achieved by simply specifying the `num_shared_layers` argument when creating a `PPOTrainer` (a sketch of this setup follows the review thread below):
Member

Maybe add a sentence clarifying that this then means that we only train the last 4 layers of the model.

Contributor Author

Yes makes sense!

Contributor Author

Isn't it the other way around? We don't train the first 4 layers and we train the rest.
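
For reference, here is a minimal sketch of the setup discussed in this thread: loading GPT-J in bfloat16 and sharing the lower layers between the active and reference models so that only the top, unshared layers are updated by PPO. It assumes TRL's `create_reference_model` helper and the `PPOConfig`/`PPOTrainer` API; the exact place where `num_shared_layers` is passed (the helper vs. the trainer) and the argument names may differ in the final docs:

```python
import torch
from transformers import AutoTokenizer
from trl import AutoModelForCausalLMWithValueHead, PPOConfig, PPOTrainer, create_reference_model

# Load GPT-J in bfloat16 to roughly halve the memory footprint of the weights.
model = AutoModelForCausalLMWithValueHead.from_pretrained(
    "EleutherAI/gpt-j-6B", torch_dtype=torch.bfloat16
)
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
tokenizer.pad_token = tokenizer.eos_token

# Share the first `num_shared_layers` layers between the active and reference models.
# Shared layers stay frozen (they must match the reference), so PPO only trains the
# remaining top layers, which also saves memory.
ref_model = create_reference_model(model, num_shared_layers=20)

config = PPOConfig(model_name="EleutherAI/gpt-j-6B", learning_rate=1.41e-5, batch_size=16)
ppo_trainer = PPOTrainer(config=config, model=model, ref_model=ref_model, tokenizer=tokenizer)
```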

docs/source/detoxifying_a_lm.mdx (resolved, outdated)
docs/source/detoxifying_a_lm.mdx (resolved, outdated)
docs/source/detoxifying_a_lm.mdx (resolved, outdated)
docs/source/detoxifying_a_lm.mdx (resolved)
docs/source/detoxifying_a_lm.mdx (resolved, outdated)
Comment on lines 164 to 166
We also think we could have trained the models on a "more toxic" dataset, as the one we used is, from our observations, much cleaner than the dataset we used for testing our models.
One hypothesis is that larger models tend to be more toxic. Therefore, one could also have played with the KL-penalty term to allow the model to deviate a bit more from its original distribution. We also believe that fine-tuning a model that is known to be toxic (i.e. trained on a toxic dataset) could lead to better results.
We have also observed that training with a larger context helps get better results for larger models, so one could also have tuned this factor to produce a better large model.
Member

I feel like there is a lot of uncertainty in those statements. I would maybe go for something like:

In addition to human feedback, this could be a useful additional signal when training large language models to ensure their outputs are less toxic as well as useful.

Contributor Author

Thanks for the feedback! I've proposed something below.
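
On the KL-penalty knob mentioned in the quoted passage above, here is a hedged sketch of how the penalty strength could be loosened. It assumes the `init_kl_coef` and `adap_kl_ctrl` fields of TRL's `PPOConfig`; names and defaults may differ across versions:

```python
from trl import PPOConfig

# Assumed parameter names from TRL's PPOConfig; check the installed version.
config = PPOConfig(
    model_name="EleutherAI/gpt-j-6B",
    init_kl_coef=0.05,  # lower than the usual ~0.2 default: weaker KL penalty,
                        # so the policy can drift further from the reference model
    adap_kl_ctrl=True,  # keep adaptive KL control so the coefficient tracks a target KL
)
```

A weaker penalty trades off closeness to the original distribution against detoxification strength, which is why the passage frames it as something to experiment with.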

@younesbelkada (Contributor Author)

Thanks a mile for the extensive review @lvwerra! I should have addressed everything now and left a few minor questions :)

@lvwerra (Member) left a comment

A few minor comments, then we can merge :)

docs/source/detoxifying_a_lm.mdx (resolved, outdated)
<img src="https://huggingface.co/datasets/trl-internal-testing/example-images/resolve/main/images/trl-collapse-mode.png">
</div>

The final training run of `ybelkada/gpt-j-6b-detoxified-1000-20shdl` looks like this:
Member

Do you need to update the model name?

Contributor Author

Nice catch, updated the model name on the Hub and here as well

docs/source/detoxifying_a_lm.mdx (resolved, outdated)
docs/source/detoxifying_a_lm.mdx (resolved, outdated)
@younesbelkada merged commit 8ec80d5 into huggingface:main on Feb 28, 2023