Use base version when comparing torch versions #16657
Merged
11 commits
c1b4b3c  use base version (awaelchli)
fb71804  use base version in other definitions too (awaelchli)
157a76f  changelog (awaelchli)
70465eb  Merge branch 'master' into bugfix/torch-version-comparision (awaelchli)
2e1b840  add comment (awaelchli)
bffbcc4  changelog (awaelchli)
65d73ba  Update src/lightning/pytorch/utilities/imports.py (carmocca)
3a3e80a  Merge branch 'master' into bugfix/torch-version-comparision (carmocca)
9fd0a22  Merge branch 'master' into bugfix/torch-version-comparision (awaelchli)
47533e5  Merge branch 'master' into bugfix/torch-version-comparision (Borda)
10ea001  Merge branch 'master' into bugfix/torch-version-comparision (awaelchli)
This change allows a different bug: if we use an API added in `1.13.0` (final release) but the user has `1.13.0+a`, where `1.13.0+a` is an earlier version that doesn't include this API, there will be an error.
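The trade-off described above can be reproduced with the `packaging` library. The following is a minimal sketch, not Lightning's actual implementation: `version_at_least` is a hypothetical helper, and the version string is the NGC build quoted elsewhere in this thread. It shows that a strict comparison treats the pre-release build as older than `1.13.0`, while a base-version comparison treats it as equal, which is exactly why an API that only exists in the final release could then be called on a build that lacks it.

```python
from packaging.version import Version


def version_at_least(installed: str, required: str, use_base_version: bool = False) -> bool:
    """Return True if `installed` >= `required`.

    Hypothetical helper for illustration only; it is not Lightning's API.
    """
    installed_version = Version(installed)
    if use_base_version:
        # Drop pre-release, dev and local segments: "1.13.0a0+d0d6b1f" -> "1.13.0"
        installed_version = Version(installed_version.base_version)
    return installed_version >= Version(required)


ngc_build = "1.13.0a0+d0d6b1f"  # pre-release build string quoted in this thread

# Strict comparison: the alpha build sorts before the final 1.13.0 release.
strict = version_at_least(ngc_build, "1.13.0")

# Base-version comparison: the alpha build is treated as if it were 1.13.0.
relaxed = version_at_least(ngc_build, "1.13.0", use_base_version=True)

print(strict, relaxed)
```

With the strict check, features gated on `1.13.0` are disabled for such builds; with the base-version check they are enabled even if the build predates the feature.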
So I suggest that we don't do this and just recommend upgrading torch instead, meaning we don't support old nightly or pre-release versions.
The user in the linked issue is using a standard docker image from nvidia: nvcr.io/nvidia/pytorch:22.10-py3
This means we won't support any of these?
I guess. I wonder why they use these PyTorch installations. One improvement we could make would be to warn the user about this.
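A warning along those lines could look roughly like the sketch below. It is only an illustration under stated assumptions: `warn_if_prerelease` is a hypothetical function name, the message text is invented, and the check relies on `packaging`'s `Version.is_prerelease`. Local version tags alone (for example `+cu117`) are deliberately not treated as suspicious, since final releases can carry them too.

```python
import warnings

from packaging.version import Version


def warn_if_prerelease(package: str, installed: str) -> bool:
    """Warn when `installed` is a pre-release or dev build.

    Hypothetical helper for illustration only; returns True if a warning was issued.
    """
    parsed = Version(installed)
    # `is_prerelease` is True for alpha/beta/rc and dev versions such as "1.13.0a0".
    if parsed.is_prerelease:
        warnings.warn(
            f"{package} {installed} is a pre-release build; features gated on "
            f"{parsed.base_version} may be missing. Consider upgrading.",
            UserWarning,
        )
        return True
    return False


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    flagged_ngc = warn_if_prerelease("torch", "1.13.0a0+d0d6b1f")
    flagged_final = warn_if_prerelease("torch", "1.13.0+cu117")

print(flagged_ngc, flagged_final, len(caught))
```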
I am using these images for the sake of repeatability in my experiments and because all the packages work out of the box (no need to manage conda/pip requirements, just run `docker run nvcr.io/nvidia/pytorch:22.10-py3 python my_script.py`).
As this is related to official Nvidia/PyTorch images, I would roll out this change with `use_base_version`.
I added a comment above the changed lines explaining the issue, with a reference to this PR.
@awaelchli This introduced a failing workflow in master (build-NGC) exactly because of this issue. The NGC 1.13 image installs a 1.13 pre-release (`1.13.0a0+d0d6b1f`) that doesn't include a feature included in the true 1.13 release: https://github.com/Lightning-AI/lightning/actions/runs/4358626506/jobs/7619447910#step:3:1685
This will fail for anybody installing this specific image. I don't have any better suggestion than reverting this PR or skipping the workflow.
@carmocca Can I implement a fix that changes the condition at https://github.com/Lightning-AI/lightning/blob/accd2b9e61063ba3c683764043030545ed87c71f/src/lightning/pytorch/core/module.py#L1635-L1641 to use the base version check? #17030
Sure. This method will go away in future releases anyway.