Device agnostic testing #25870
Conversation
Thanks, I will take a look @vvvm23
@vvvm23 Could you pull the latest main to your local clone and rebase your PR branch on top of your local `main`?
Force-pushed the branch from dc6c9e7 to c961bd6.
I rebased a few days ago, but realised I forgot to ping! Sorry @ydshieh!
No problem, I will take a look this week 🙏
Hi! Thank you for opening this PR. Overall it's good! I left a few nit comments.
I will have to check the failing tests though.
Have you tried to run the tests with a 3rd party device (with this PR of course)?
@vvvm23 well noted. Thank you for the contribution, and best wishes for your next adventure! (I was just back from a break, will take a look and ping @arampacha for necessary changes if any)
Hi, as Alex mentioned, I’ll be taking over this PR. Just wanted to check in on the status of the review; please let me know when you can if there are any comments/further changes you’d like made 🙂
Hi @arsalanu Toward the end of ... For example, instead of ... If we check ... And those methods are actually methods for torch backends. WDYT?
Other than this, it looks good to me. It would be great to see an example run on a 3rd party device. Finally, we should update the documentation here: transformers/docs/source/en/testing.md (line 514 in b5ca8fc)
In this PR, we don't have to apply all the new things like ...
Hi, I've updated the PR to rename the functions; I agree that ...
Thanks, @statelesshz - I've made your changes to this PR. @ydshieh when you can, could you please change this PR from a 'draft' to ready-for-review if you feel it is ready to be approved? (I don't have access to do this) Thank you!
Sure. BTW, could you run ...
Looks like they're all passing now 👍
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
Hi @LysandreJik This PR is ready for a review from core maintainers 🙏. It enables testing on different types of accelerators. The only thing I see that might be a bit of an inconvenience is the file ... If this is the case, we can move the actual definitions to a new file and import them from there into transformers/src/transformers/testing_utils.py (lines 2201 to 2281 in a1230d4)
looking forward to this PR being accepted asap 🤗
@fxmarty, given your work with other accelerators, do you have any feedback on this PR?
LGTM
@@ -291,7 +292,7 @@ def test_generate_fp16(self):
     input_ids = input_dict["input_ids"]
     attention_mask = input_ids.ne(1).to(torch_device)
     model = OPTForCausalLM(config).eval().to(torch_device)
-    if torch_device == "cuda":
+    if is_torch_fp16_available_on_device(torch_device):
This test should probably just be skipped if the device does not support fp16 (alternatively, adding `@require_torch_fp16`).
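For illustration only, the decorator route could look roughly like the sketch below. The tiny config and random inputs are stand-ins, not the PR's actual test code; `require_torch_fp16` is the decorator suggested above, and `torch_device` is the usual test-device helper.

```python
import unittest

import torch

from transformers import OPTConfig, OPTForCausalLM
from transformers.testing_utils import require_torch_fp16, torch_device


class OPTFp16GenerateSketch(unittest.TestCase):
    @require_torch_fp16  # skips the test when torch_device has no fp16 support
    def test_generate_fp16(self):
        # tiny illustrative config; the real test builds inputs via its model tester
        config = OPTConfig(vocab_size=99, hidden_size=16, num_hidden_layers=2,
                           num_attention_heads=4, ffn_dim=32, max_position_embeddings=64)
        input_ids = torch.randint(2, 99, (1, 8)).to(torch_device)
        attention_mask = input_ids.ne(1).to(torch_device)
        model = OPTForCausalLM(config).eval().to(torch_device)
        model.half()
        model.generate(input_ids, attention_mask=attention_mask, max_new_tokens=4)
```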
cc @arsalanu
After fixing @fxmarty's feedback, LGTM & merge ahead
Thank you @LysandreJik, @fxmarty for the look-over. I'm just waiting for the tests to pass and then will merge.
@ydshieh (or @vvvm23!) I don't have access to merge this. It looks ready to go, could one of you click the button? Reminder to include @statelesshz as a contributor :)
I don't have the ability sadly 😅
Thank you again @vvvm23 @arsalanu and @statelesshz (and especially for reminding me to add @statelesshz as a co-author!)
My bad 😭 sorry. But I think GitHub attributes it by email, so @statelesshz still appears.
* adds agnostic decorators and availability fns
* renaming decorators and fixing imports
* updating some representative example tests bloom, opt, and reformer for now
* wip device agnostic functions
* lru cache to device checking functions
* adds `TRANSFORMERS_TEST_DEVICE_SPEC` if present, imports the target file and updates device to function mappings
* comments `TRANSFORMERS_TEST_DEVICE_SPEC` code
* extra checks on device name
* `make style; make quality`
* updates default functions for agnostic calls
* applies suggestions from review
* adds `is_torch_available` guard
* Add spec file to docs, rename function dispatch names to backend_*
* add backend import to docs example for spec file
* change instances of to
* Move register backend to before device check as per @statelesshz changes
* make style
* make opt test require fp16 to run

---------

Co-authored-by: arsalanu <arsalanu@graphcore.ai>
Co-authored-by: arsalanu <hzji210@gmail.com>
What does this PR do?
Adds extra capabilities to `testing_utils.py` to support testing on devices besides `cuda` and `cpu`, without having to upstream device-specific changes.

This involves introducing some device agnostic functions that dispatch to specific backend functions. Users can specify new backends, and backends for the device agnostic functions, by creating a device specification file and pointing the test suite to it using `TRANSFORMERS_TEST_DEVICE_SPEC`.

An example specification for a hypothetical CUDA device without support for `torch.cuda.empty_cache` could look like this:
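(The original example is not preserved on this page; the following is only a sketch of what such a spec file might contain. The variable names `DEVICE_NAME`, `MANUAL_SEED_FN`, `EMPTY_CACHE_FN`, `DEVICE_COUNT_FN` and the no-op fallback are assumptions based on the mechanism described above, not the PR's exact example.)

```python
# spec.py -- hypothetical spec file for a CUDA device lacking empty_cache support.
# Variable names and structure are assumptions, not the PR's exact example.
import torch

# Device name the test suite should treat as `torch_device`
DEVICE_NAME = "cuda"

# Backend functions the device agnostic helpers dispatch to
MANUAL_SEED_FN = torch.cuda.manual_seed
DEVICE_COUNT_FN = torch.cuda.device_count


# torch.cuda.empty_cache is assumed unavailable here, so dispatch to a no-op
def EMPTY_CACHE_FN(*args, **kwargs):
    return None
```

The test suite would then be pointed at the file via the environment variable, e.g. something like `TRANSFORMERS_TEST_DEVICE_SPEC=spec.py python -m pytest tests/models/opt` (path and test selection here are illustrative).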
By default, the `cpu` and `cuda` backends are available, so as not to affect default behaviour.

We also introduce a new decorator, `@require_torch_accelerator`, which can be used to specify that a test needs an accelerator (but not necessarily a CUDA one).

Crucially, these changes should not change the behaviour of upstream CI runners. They aim to be as non-intrusive as possible and do not break compatibility with tests written before these changes are made.
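As a rough usage sketch (the test class, test body, and checkpoint name are illustrative, not taken from the PR), the decorator is applied like the existing requirement decorators in `testing_utils.py`:

```python
import unittest

from transformers import AutoModel
from transformers.testing_utils import require_torch_accelerator, torch_device


class TinyModelDeviceSketch(unittest.TestCase):
    @require_torch_accelerator  # skipped on CPU-only setups, runs on any registered accelerator
    def test_model_moves_to_accelerator(self):
        # illustrative tiny checkpoint; any small model would do
        model = AutoModel.from_pretrained("hf-internal-testing/tiny-random-bert")
        model.to(torch_device)
        self.assertNotEqual(model.device.type, "cpu")
```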
In this PR, only a subset of all tests are updated to support these new features at first. These are:

- `test_modeling_bloom` – demonstrating usage of the new `@require_torch_accelerator`
- `test_modeling_codegen` – demonstrating usage of a device agnostic function (`accelerator_manual_seed`); see the seed-dispatch sketch below
- `test_modeling_opt` – demonstrating another device agnostic function, this time to check whether the current device supports `torch.float16`
- `test_modeling_reformer` – decorator version of the above

Related #25654
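As referenced in the `test_modeling_codegen` item above, seeding goes through a dispatch helper rather than calling `torch.cuda.manual_seed` directly. A minimal sketch, assuming the post-review `backend_*` naming and a `(device, seed)` signature:

```python
from transformers.testing_utils import backend_manual_seed, torch_device

# Seeds whichever backend `torch_device` maps to (torch.cuda.manual_seed on CUDA,
# the spec-file function on a custom device, or the default fallback elsewhere).
backend_manual_seed(torch_device, 0)
```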
TODO:

- `TRANSFORMERS_TEST_DEVICE_SPEC` (once we finalise the PR)
Who can review?
@ydshieh