Add CLM training example #248

Merged 9 commits into huggingface:main on Jul 13, 2022

Conversation

carzh (Contributor) commented Jun 30, 2022

What does this PR do?

  • Adds a CLM training example and a brief README for it

Built off of work from the transformers/examples repo.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

JingyaHuang (Collaborator) commented Jul 1, 2022

Hi @carzh, thanks for adding the training example!

Could you reformat the changed files with `make style`? In the meantime, I will work on the review, thanks!

JingyaHuang (Collaborator) left a comment

Hi @carzh, thanks for the example!

I left some small nits in the example. For the PyTorch baseline, are the ORT folks planning to use it for benchmarking purposes? We would prefer to use only ORTTrainer, since this is an ORT training example.

P.S. For the code quality check, could you run `pip install ".[quality]"` before `make style` to ensure that you are using the correct versions of black and isort? This should help the scripts pass the quality check, thanks!
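
For reference, here is a minimal, non-authoritative sketch of what "use only ORTTrainer" looks like for a CLM script, assuming ORTTrainer mirrors the transformers Trainer API and that optimum's onnxruntime training dependencies are installed. The tiny checkpoint, dataset slice, and hyperparameters below are illustrative placeholders, not the values used in the example.

```python
# Hedged sketch: drive causal-LM fine-tuning through ORTTrainer
# (optimum.onnxruntime) instead of a plain transformers Trainer baseline.
# Model, dataset slice, and hyperparameters are placeholders.
from datasets import load_dataset
from optimum.onnxruntime import ORTTrainer
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    TrainingArguments,
    default_data_collator,
)

model_name = "sshleifer/tiny-gpt2"  # placeholder checkpoint for illustration
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 tokenizers have no pad token
model = AutoModelForCausalLM.from_pretrained(model_name)

raw_dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="train[:1%]")

def tokenize(examples):
    out = tokenizer(examples["text"], truncation=True, padding="max_length", max_length=128)
    out["labels"] = out["input_ids"].copy()  # causal LM: labels equal the inputs; the model shifts them internally
    return out

train_dataset = raw_dataset.map(tokenize, batched=True, remove_columns=["text"])

trainer = ORTTrainer(
    model=model,
    args=TrainingArguments(output_dir="clm_ort_output", per_device_train_batch_size=2, num_train_epochs=1),
    train_dataset=train_dataset,
    data_collator=default_data_collator,
)
trainer.train()  # training runs through ONNX Runtime's training backend
```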

JingyaHuang mentioned this pull request on Jul 4, 2022
carzh (Contributor, Author) commented Jul 6, 2022

Hi @JingyaHuang, thank you so much for taking a look at the PR and adding the helpful comments!

To summarize the changes and responses to your comments:

  • The ORT team was originally planning to use it for benchmarking, but we removed the ORT flag and will follow your suggestion of comparing against transformers/examples for benchmarking instead.
  • Updated the requirements file according to your suggestions.
  • Did not add the preprocess_logits_for_metrics argument back in, because ORTTrainer does not accept that argument.

Note: protobuf > 3.21.x will break the training.
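
As a side note, one hedged way to surface that incompatibility early is a runtime guard like the sketch below; the cutoff used there is an assumption taken from the "protobuf > 3.21.x" note above, not a tested bound, and the guard itself is not part of this PR.

```python
# Hypothetical guard (not part of the PR): fail fast if the installed protobuf
# falls in the range reported above to break ORT training. The exact cutoff is
# an assumption based on the "protobuf > 3.21.x" note.
from importlib.metadata import version
from packaging.version import Version

protobuf_version = Version(version("protobuf"))
if protobuf_version > Version("3.21"):
    raise RuntimeError(
        f"protobuf {protobuf_version} is installed; versions above 3.21.x are "
        "reported to break ORT training. Pin protobuf to an earlier release."
    )
```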
JingyaHuang (Collaborator) left a comment

Hi @carzh, thanks for iterating on this, LGTM!

For your information, the script currently works only with the latest transformers version, as required by the ORTModels dependencies, so gpt2 mixed-precision training will fail in this case. To solve the issue, we are working on:

  • transformers PR #18017 - Fix gpt2 fp16 training, which breaks after transformers > 4.16.0 (with this fix, users can run gpt2 fp16 training with the latest transformers)
  • an upcoming PR, "Associated import lead to import fails" #265, to decouple the imports of the different APIs (with this fix, users can run gpt2 fp16 training with transformers 4.16.0)

carzh (Contributor, Author) commented Jul 12, 2022

Hello, just wanted to confirm: is this PR on standby until the GPT2 fp16 issue is resolved, or can it be merged sooner and is only pending other checks? Thanks, and sorry for any inconvenience!

JingyaHuang (Collaborator) commented:

Hi @carzh, actually there are two possible modifications to the CLM example:

  • The gpt2 fix (the PR is awaiting approval from the transformers core maintainers) -> we need to raise the minimum transformers version in the example.
  • In Add ORT fused adam optimizer #295, new ORT training arguments were added that enable the ORT fused Adam optimizer -> we need to replace the training arguments with the ORT training arguments (see the sketch below).

But I think it would be better to add these modifications after those two PRs make their way into the main branch, so I will merge this PR first!
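
As a rough sketch of that second follow-up, once #295 lands the example's TrainingArguments could be swapped for ORTTrainingArguments to opt into the fused optimizer. The optim value and hyperparameters below are assumptions based on that PR's description, not confirmed values from the example.

```python
# Hedged sketch of the follow-up change: use ORTTrainingArguments so the CLM
# example can select the ORT fused Adam optimizer. The "adamw_ort_fused" value
# is an assumption based on PR #295's description.
from optimum.onnxruntime import ORTTrainingArguments

training_args = ORTTrainingArguments(
    output_dir="clm_ort_output",      # placeholder output directory
    optim="adamw_ort_fused",          # assumed optimizer name from #295
    per_device_train_batch_size=2,
    num_train_epochs=1,
)

# The rest of the script stays the same: optimum.onnxruntime.ORTTrainer consumes
# these ORT-specific arguments in place of transformers.TrainingArguments, e.g.
# trainer = ORTTrainer(model=model, args=training_args, train_dataset=train_dataset, ...)
```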

JingyaHuang merged commit 7c1a621 into huggingface:main on Jul 13, 2022