
Automatically get device from first model parameter if device parameter to summary is None #211

Merged: 2 commits merged into TylerYep:main on Jan 13, 2023

Conversation

@snimu (Contributor) commented on Jan 11, 2023

Issue #209 can be avoided manually by setting the device parameter in the settings. This pull request makes it so that if device is None, the device is extracted from the model by looking at its first parameter: if that parameter is on CUDA, its device is used; if it isn't, or the model has no parameters, the previous behavior is maintained. This solves the issue.
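
Sketched roughly, the new logic looks like the following. This is a minimal illustration of the behavior described above, not the exact torchinfo code; get_device is a hypothetical name used here for clarity:

```python
from typing import Optional

import torch
import torch.nn as nn


def get_device(model: nn.Module, device: Optional[torch.device]) -> torch.device:
    """Illustrative sketch: pick a device as described in the PR summary."""
    if device is not None:
        return torch.device(device)
    try:
        first_param = next(model.parameters())
    except StopIteration:
        # Model has no parameters: keep the previous default behavior.
        return torch.device("cuda" if torch.cuda.is_available() else "cpu")
    if first_param.device.type == "cuda":
        # Keep the model where it already lives, e.g. cuda:1 stays cuda:1.
        return first_param.device
    # First parameter is on CPU: previous behavior (prefer cuda if available).
    return torch.device("cuda" if torch.cuda.is_available() else "cpu")
```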

Ran pre-commit and everything passed, including the new unit tests for multi-GPU.

This is my first contribution to torchinfo and I'm not super experienced; if there are any problems with this pull request, I'd love some feedback :)

snimu added 2 commits January 11, 2023 09:17
- use it to determine device if no device is given to summary
- Automatically determines device of model so that it does not have to be moved in case of multi-GPU-setup.
- should fix issue #209 (#209). A test checking this on a different device is still to be written; this commit sets up the functionality without breaking old tests.
codecov bot commented on Jan 11, 2023

Codecov Report

Merging #211 (ceabd65) into main (b4d80c0) will decrease coverage by 0.13%.
The diff coverage is 88.88%.

@@            Coverage Diff             @@
##             main     #211      +/-   ##
==========================================
- Coverage   97.52%   97.39%   -0.14%     
==========================================
  Files           6        6              
  Lines         607      615       +8     
==========================================
+ Hits          592      599       +7     
- Misses         15       16       +1     
| Impacted Files | Coverage Δ |
| --- | --- |
| torchinfo/torchinfo.py | 97.02% <88.88%> (-0.34%) ⬇️ |


@TylerYep changed the title from "Fix [issue #209](https://github.com/TylerYep/torchinfo/issues/209): automatically find out device of model if device parameter to summary is None" to "Automatically get device from first model parameter if device parameter to summary is None" on Jan 11, 2023
@TylerYep linked an issue on Jan 11, 2023 that may be closed by this pull request
@TylerYep (Owner) commented:

Thank you for fixing this! It looks like an improvement for auto-detecting the device in most cases.

@TylerYep merged commit 8305e1f into TylerYep:main on Jan 13, 2023
@snimu (Contributor, Author) commented on Jan 13, 2023

I'm happy to help! This was fun :)
I do want to share some thoughts about the PR, though:

So far, I've opted for the simplest and most efficient solution: take the model, look at its first parameter, and pick its device. (Option 1)

A better but less efficient approach would have been to loop over all of the model's parameters and collect their devices, and also get the device of input_data if it is not None. Then the devices could be compared. If they are all the same, I would do as in this pull request: select that device if it is on "cuda", else try to pick "cuda", and fall back to "cpu" otherwise. If the devices are not all the same, an exception could be raised. (Option 2; see the sketch after Option 3 below)

A third option would be to do as in Option 2 but, instead of raising an exception when the devices differ, keep the model distributed (and handle input_data appropriately) if it is spread across several GPUs. This would allow torchinfo to work for very large, distributed models, but I believe it would require a medium- to large-sized overhaul of the codebase. (Option 3)
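
As a rough sketch of what Option 2 could look like (hypothetical code, not part of this PR; get_common_device is an illustrative name):

```python
from typing import Optional

import torch
import torch.nn as nn


def get_common_device(
    model: nn.Module, input_data: Optional[torch.Tensor]
) -> torch.device:
    """Option 2 sketch: require all parameters (and input) to agree on a device."""
    devices = {p.device for p in model.parameters()}
    if input_data is not None:
        devices.add(input_data.device)
    if len(devices) > 1:
        # Option 2: refuse to guess when model/input span several devices.
        raise RuntimeError(f"Parameters/input on multiple devices: {devices}")
    if devices:
        (device,) = devices
        if device.type == "cuda":
            return device
    # No parameters, or everything on CPU: previous default behavior.
    return torch.device("cuda" if torch.cuda.is_available() else "cpu")
```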

I could implement Option 2 if you think it makes more sense than the current approach. I could also use the opportunity to add the .out files in tests/test_output that I forgot in this pull request ;) (The tests passed when the pull request was opened because they are multi-GPU only and therefore weren't executed; I've run them locally, though, and they already work and produce plausible output.)
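
For context, a multi-GPU test in this spirit might look like the following. This is a hypothetical sketch, not necessarily the test added in this PR; it is skipped on machines with fewer than two GPUs, which is why such tests don't run in ordinary CI:

```python
import pytest
import torch
import torch.nn as nn

from torchinfo import summary


@pytest.mark.skipif(torch.cuda.device_count() < 2, reason="requires >= 2 GPUs")
def test_model_stays_on_its_gpu() -> None:
    # Place the model on a non-default GPU and summarize without passing device.
    model = nn.Linear(4, 2).to("cuda:1")
    summary(model, input_size=(1, 4))
    # Without the fix, the model would be moved to cuda:0 (see issue #209).
    assert next(model.parameters()).device == torch.device("cuda:1")
```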


Successfully merging this pull request may close these issues.

Models on cuda are always moved to GPU 0 by torchinfo.summary