-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added check to verify xla device is TPU #3274
Conversation
Codecov Report
@@ Coverage Diff @@
## master #3274 +/- ##
=======================================
+ Coverage 84% 87% +4%
=======================================
Files 119 118 -1
Lines 9764 9184 -580
=======================================
- Hits 8169 8013 -156
+ Misses 1595 1171 -424 |
0cf6eaa
to
f52321a
Compare
Hello @lezwon! Thanks for updating this PR. There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2020-10-05 17:56:08 UTC |
a4fa477
to
f315261
Compare
@Borda @justusschock @awaelchli I need some help fixing the windows tests. Has to do something with pickling. |
@lezwon I think this is a common subprocess thing, when you launch it, it needs to import everything first before it runs it, and so we have to guard the call from the import, i.e., you would have to add def tpu_device_exists():
if xm is not None:
device = xm.xla_device()
device_type = fetch_xla_device_type(device)
return device_type == "TPU"
else:
return False
if __name__ == "__main__":
TPU_AVAILABLE = pl_multi_process(tpu_device_exists)()
print(TPU_AVAILABLE) but it's obviously not how we want it. My suggestions is to move towards a functional check, like I propose here #2877. Then, we shouldn't get the pickle error since the value is only computed at runtime. |
dabcf7f
to
17867e2
Compare
hwy @lezwon mind taking a look at this again? we would love to have this issue resolved |
@edenafek will continue on it by the weekend. |
@lezwon mind resolve conflicts so we know what is the actual state... |
ef11533
to
8460e0d
Compare
@lezwon do you still need help with this? is there a problem with windows? |
hey, I got it working with a minor workaround. Not sure about the cause of that issue though :) |
is it ready for review @lezwon ? |
yep. it is |
queue.put(None) | ||
|
||
|
||
def pl_multi_process(func): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is a similar, if not identical function in tests/base/develop_utils.py
do we need both? I cannot see an obvious difference besides the inner_f being defined outside.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I basically used the other function itself as reference. the one in develop_utils
is just for tests right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, but couldn't you delete the one from tests and then import from here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The pl_multi_process_test
varies a bit compared to this function. It has an assert
statement within and returns either 1
or -1
for the test. This one is meant to return the device type on None
.
0ceeb30
to
051df9b
Compare
Thank you :) It looks good to me 👍 |
This pull request is now in conflict... :( |
This pull request is now in conflict... :( |
0a1a435
to
3a71c87
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This pull request is now in conflict... :( |
# Conflicts: # CHANGELOG.md # pytorch_lightning/accelerators/tpu_backend.py # pytorch_lightning/trainer/data_loading.py # tests/models/test_tpu.py
3a71c87
to
7732fc5
Compare
This pull request is now in conflict... :( |
What does this PR do?
Fixes #3104
Before submitting
PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
Did you have fun?
Make sure you had fun coding 🙃