
[Feature] Enable Intel XPU support #839

Merged: 34 commits merged into huggingface:main on Oct 31, 2023
Conversation

abhilash1910
Contributor

Thanks for creating this library. As part of our effort to streamline Hugging Face support on Intel devices, this PR is an important step. With support already enabled in peft, accelerate, and transformers, this would be a good addition so that TRL runs out of the box on our devices.
cc @younesbelkada @pacman100
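
For context on what "XPU support" means at the code level, here is a minimal, hypothetical sketch of backend-agnostic device selection (this is not the actual PR diff; the helper name is illustrative, and `torch.xpu` is only exposed when Intel XPU support for PyTorch is installed):

```python
import torch


def get_best_device() -> torch.device:
    """Pick an available accelerator, falling back to CPU.

    Illustrative helper only: `torch.xpu` exists only on builds with
    Intel XPU support (e.g. via intel_extension_for_pytorch), so we
    probe for it with hasattr() before calling is_available().
    """
    if torch.cuda.is_available():
        return torch.device("cuda")
    if hasattr(torch, "xpu") and torch.xpu.is_available():
        return torch.device("xpu")
    return torch.device("cpu")
```

Probing with `hasattr` keeps the same code path working on CUDA, XPU, and CPU-only installs, which is the spirit of the changes discussed below.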

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

Contributor

@younesbelkada left a comment


Thanks a lot for your great efforts enabling TRL for Intel XPU devices!
My question is about backward compatibility with this PR: it seems you have removed optimize_cuda_cache, which makes the library not backward compatible for users who pass that argument. Would it be possible to add a new attribute in PPOConfig, optimize_device_cache, that is initialized with the same value as optimize_cuda_cache, and raise a warning telling users that optimize_cuda_cache is going to be deprecated in the future?
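
The suggested deprecation pattern could look roughly like this (a minimal sketch assuming a dataclass-style PPOConfig; everything except the two attribute names is illustrative and not taken from the actual PR):

```python
import warnings
from dataclasses import dataclass
from typing import Optional


@dataclass
class PPOConfig:
    """Sketch of a backward-compatible rename of optimize_cuda_cache."""

    # New, device-agnostic name.
    optimize_device_cache: bool = False
    # Old name kept as an alias; None means "not set by the user".
    optimize_cuda_cache: Optional[bool] = None

    def __post_init__(self):
        if self.optimize_cuda_cache is not None:
            warnings.warn(
                "`optimize_cuda_cache` is deprecated; use "
                "`optimize_device_cache` instead.",
                DeprecationWarning,
            )
            # Initialize the new attribute from the old one so existing
            # callers keep their behavior.
            self.optimize_device_cache = self.optimize_cuda_cache
```

Old code passing `PPOConfig(optimize_cuda_cache=True)` keeps working but sees a `DeprecationWarning`, while new code uses `optimize_device_cache` directly.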

@abhilash1910
Contributor Author

Thanks @younesbelkada for the suggestion. Could you help re-trigger the CI and re-review? Thanks

@abhilash1910
Contributor Author

@younesbelkada could you help review and re-trigger the CI? I see the previous CI failure was arising due to mps. I ran the tests locally, and they seem to pass.

Contributor

@younesbelkada left a comment


Thanks again! I would like @lvwerra to have a second look if possible

@younesbelkada requested a review from lvwerra on October 10, 2023, 08:09
Member

@lvwerra left a comment


Generally looks good to me. Have you had a chance to run the sentiment script with XPU to make sure everything works? Would you mind sharing the learning curves so we can compare to GPU runs?

trl/trainer/ppo_config.py (review thread resolved)
@abhilash1910
Contributor Author

abhilash1910 commented Oct 11, 2023

Generally looks good to me. Have you had a chance to run the sentiment script with XPU to make sure everything works? Would you mind sharing the learning curves so we can compare to GPU runs?

Yes, we are running all the example scripts on our devices. I can share the loss curves for different tasks here. Some of the performance data is internal, but we can share the functionality aspects.
@lvwerra @younesbelkada could you help trigger the CI? Thanks

@lvwerra
Member

lvwerra commented Oct 11, 2023

Could you, for example, run python examples/scripts/sentiment_tuning.py and post the reward/KL curves? That's what we usually use for regression checks.

@abhilash1910
Contributor Author

@lvwerra @younesbelkada I tried dpo.py as a training sanity check on our GPUs; the wandb log: https://wandb.ai/abhilash-majumder/huggingface/runs/chj3jbgf?workspace=user-abhilash-majumder
I am in the process of running the other experiments and notebooks as well, but as an initial functionality sanity check, everything seems OK on our devices. Thanks

@younesbelkada
Contributor

Thanks very much for running the experiments @abhilash1910! Looking at the logs, it seems the wandb run is private :/ Can you make the logs public so that we can have a look at them? 🙏 Thanks!

@abhilash1910
Contributor Author

Thanks very much for running the experiments @abhilash1910! Looking at the logs, it seems the wandb run is private :/ Can you make the logs public so that we can have a look at them? 🙏 Thanks!

Oh, apologies, it is public now. Could you let me know if you are able to access it? Thanks @younesbelkada

@abhilash1910
Contributor Author

@younesbelkada could you please help re-trigger the CI? Thanks.

@abhilash1910
Contributor Author

@younesbelkada @lvwerra if the changes make sense and the internal slow tests pass, can this be merged? Also, is there a Slack channel where I can follow up further regarding the benchmarking performance stats from our devices (if required)? Thanks

@lvwerra
Member

lvwerra commented Oct 13, 2023

Hi @abhilash1910, the PR is in good shape to be merged IMO. If you could run python examples/scripts/sentiment_tuning.py that would be great, since we have good baselines for comparison and it should only take ~1h to see that it works. DPO is a bit newer, and we are less sure what regressions would look like there.

@abhilash1910
Contributor Author

Hi @lvwerra, there seems to be no "sentiment_tuning.py" script, but there is a "ppo.py" script, which I tried out. Just for testing purposes: https://wandb.ai/abhilash-majumder/trl/runs/k0kletcb/overview?workspace=user-abhilash-majumder (let me know if you can access it). It contains the stdout (prints), as I was making sure all results were correct. I am in the process of taking a longer run. Thanks!

@lvwerra
Member

lvwerra commented Oct 17, 2023

Oh yes, that's it - we recently renamed it. Looks like your W&B logs are private though :)

@abhilash1910
Contributor Author

Oh, I thought it was public; it is now. Could you also re-trigger the CI? Thanks

@abhilash1910
Contributor Author

@lvwerra @younesbelkada could you help re-trigger the CI and re-review? If any further modules need to be supported from our side, that will come in a separate PR. Thanks

@lvwerra
Member

lvwerra commented Oct 25, 2023

Could you merge main into your branch? We deactivated the flaky tests so now it should run through.

@lvwerra
Member

lvwerra commented Oct 25, 2023

Looks good, if you can share the longer run, I think we can merge :)

@abhilash1910
Contributor Author

abhilash1910 commented Oct 26, 2023

@lvwerra completed the ppo script with 194 iterations; I am sharing the logs here:
ppo_train.log
I did not use wandb for this run, but I wanted a full run to check the timeline.
Please let me know if this can be included in this version's release. Thanks for the support.

@lvwerra
Member

lvwerra commented Oct 26, 2023

Hi @abhilash1910, did you still save the training logs? All we want to check is that PPO also converges at the same rate as on the GPU. E.g. the mean reward vs. iteration plot would be helpful. Thank you!

@abhilash1910
Contributor Author

abhilash1910 commented Oct 26, 2023

@lvwerra Unfortunately I did not save the entire wandb log for the previous run, so I took another snapshot over 6 iterations for the PPO mean reward and the returns:
[image: PPO mean reward and returns curves]
It seems to be converging. The policy KL curve is something I am looking at:
[image: policy KL curve]
It seems to be going as it should. Let me know if this looks OK? Thanks

Member

@lvwerra left a comment


Awesome thanks!

@lvwerra merged commit ec9e766 into huggingface:main on Oct 31, 2023
8 checks passed
@abhilash1910
Contributor Author

Thanks @lvwerra , @younesbelkada for your continued support.

lapp0 pushed a commit to lapp0/trl that referenced this pull request May 10, 2024
* enable xpu support

* fix bug

* review commits

* fix style

* add xpu decorator

* refactor review commit

* fix test

* review commit

* fix test

* Update benchmark.yml (huggingface#856)

* Standardise example scripts (huggingface#842)

* Standardise example scripts

* fix plotting script

* Rename run_xxx to xxx

* Fix doc

---------

Co-authored-by: Costa Huang <costa.huang@outlook.com>

* Fix version check in import_utils.py (huggingface#853)

* dont use get_peft_model if model is already peft (huggingface#857)

* merge conflict

* add xpu decorator

* resolve

* resolves

* upstream

* refactor and precommit

* fix new tests

* add device mapping for xpu

---------

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
Co-authored-by: Costa Huang <costa.huang@outlook.com>
Co-authored-by: Adam Pauls <adpauls@gmail.com>
Co-authored-by: abhishek thakur <1183441+abhishekkrthakur@users.noreply.github.com>