uploading images #24
Comments
@weiji14, let me know if you get a chance to test them.
note
missing packages from these images are here: #21. I haven't had a chance to run any benchmarks yet, but I will look into that soon...
Cool, thanks @ngam, I'll try and give this a spin on my GPU over the weekend. Is there a good benchmark you'd recommend to test this on? Preferably something light that takes <16GB of GPU RAM.
Sorry I didn't respond here... Not really sure about benchmarks; I usually only run my own models, and usually in TensorFlow. Let me know if you manage to get something going.
Ok, found an easy-ish benchmark script at https://github.com/cresset-template/cresset/blob/7762a947ff567003befbab3d217364f9fcf98b67/benchmark.py. To run it, do:
git clone https://github.com/cresset-template/cresset.git
cd cresset/
Below are the tests I ran on an NVIDIA RTX A5000 Laptop GPU; the only thing I changed was the docker image ( NGC-based
Yes, but I'm glad it is only minor! I think what we can do is try harder to push the conda-forge feedstocks to copy the NGC builds... I'm already doing that with tensorflow
Yeah, but like you said, those tiny differences might add up. Say someone was training a neural network for 1 hour: 10 seconds saved per minute would mean 10 x 60 = 600 seconds, or 10 minutes less time per hour. If you expand that to 1 day (24 hours), then that's 240 minutes or 4 hours saved! If you can pin the
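The savings arithmetic above can be sketched in a few lines of Python. This is only an illustration of the back-of-the-envelope estimate; the function name `minutes_saved` and its parameters are hypothetical, and the 10 seconds per minute figure is the example number from the comment, not a measured benchmark result.

```python
def minutes_saved(seconds_saved_per_minute: float, training_hours: float) -> float:
    """Total minutes saved over a training run of the given length.

    Assumes the per-minute saving stays constant for the whole run.
    """
    training_minutes = training_hours * 60
    total_seconds_saved = seconds_saved_per_minute * training_minutes
    return total_seconds_saved / 60


# Example figures from the comment: 10 seconds saved per minute of training.
print(minutes_saved(10, 1))        # 10.0 minutes saved over 1 hour
print(minutes_saved(10, 24))       # 240.0 minutes saved over 24 hours
print(minutes_saved(10, 24) / 60)  # 4.0 hours saved over 24 hours
```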
this weird pin is from NGC... https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/rel_22-04.html#rel_22-04
Interesting, so they are pinning specific PyTorch commits?!! I'm usually ok with bleeding edge software, but not sure if this is ok for general Pangeo users 😅
You're absolutely right on this btw. Also, take into account two additional points: 1) toy models are double-edged swords: they're somewhat optimized while being relatively light. I suspect that for an actual researcher who ends up paying close attention to performance, the saved time will be a little more. So, I don't want to discount this premise, it is very important; this is what drove me to do this to begin with :)
I will try to upload some images later this week. We can at least document the process for interested community members if they have access to V100 or A100 GPUs and want some more performance!
Originally posted by @ngam in pangeo-data/pangeo-docker-images#320 (comment)