Purge the pip cache after installing torch #143

Merged: 1 commit, Nov 1, 2023

@mdagost mdagost commented Oct 31, 2023

The PyTorch GPU container is quite large, and building off of it is causing timeouts on Databricks (see #142). Purging the pip cache after the torch install seems to save about 2 GB of image size. It looks like this is already done here in the venv image, but not after the torch installs. This PR fixes that.
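
For readers unfamiliar with the technique, a minimal sketch of what purging the pip cache after an install can look like in a Dockerfile follows. The base image, the `BASE_IMAGE` argument, and the unpinned `torch` package are assumptions for illustration; this is not the actual diff from this PR. The purge runs in the same `RUN` instruction as the install, because files deleted in a later layer still take up space in the earlier one.

```dockerfile
# Hypothetical base image, for illustration only; any image that ships pip should work.
ARG BASE_IMAGE=python:3.10-slim
FROM ${BASE_IMAGE}

# Install torch and purge the pip cache in the same RUN instruction.
# Purging in a separate, later RUN would not shrink the image, because
# the cached wheels would already be stored in the earlier layer.
RUN pip install torch \
    && pip cache purge
```

An alternative that should have a similar effect is `pip install --no-cache-dir torch`, which avoids writing the cache in the first place.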

@panchalhp-db panchalhp-db requested a review from ygong1 November 1, 2023 16:04

@panchalhp-db panchalhp-db left a comment

Thank you for catching this bug and helping fix it! We'll get this deployed to Docker Hub soon.

@panchalhp-db panchalhp-db merged commit e992df9 into databricks:master Nov 1, 2023
@panchalhp-db

@mdagost the Docker images on Docker Hub have been updated with this change: https://hub.docker.com/layers/databricksruntime/gpu-pytorch/cuda11.8/images/sha256-4fd825414b78dc352602f427537cae086126ad1cf47edcf793ef196d2b958ff2 and the image size is down to 4.87 GB. Thank you again for discovering the issue and helping fix it as well!
