Support setting the trainer reference recursively for ensembles #13638
Conversation
I like it!
force-pushed from 405a6d7 to 65f22ad
force-pushed from 10a07fe to 249f29e
force-pushed from 78efd97 to 24eadb5
Codecov Report

```diff
@@            Coverage Diff             @@
##           master   #13638      +/-   ##
=========================================
+ Coverage      49%      76%      +28%
=========================================
  Files         327      327
  Lines       25492    25547      +55
=========================================
+ Hits        12452    19509    +7057
+ Misses      13040     6038    -7002
```
It seems this PR introduced a failing test, which somehow hadn't failed before. The issue in the failing test is that the trainer instance has already been garbage-collected, which can happen with weakrefs. cc @carmocca
Do you see it failing in different PRs? Do you suggest we remove the weakref, or that we ensure it doesn't get garbage collected in the test?
I haven't noticed it in other PRs yet. It could be that something we introduced in the last week made it trigger, but I don't have an idea of what it could be. I think the proper solution is to ensure it doesn't get garbage collected in the test, as I have a hard time imagining a situation in which a
I'm just not sure what I should change to fix it. This variable should already hold the trainer reference: https://github.com/Lightning-AI/lightning/blob/0fb31ed614e73be903f9d8b339247bae24440566/tests/tests_pytorch/utilities/test_parsing.py#L67
Right, but that variable goes out of scope the moment the function returns and is therefore free to be garbage-collected, since the only place we still hold the trainer is the weak reference. I think what would help is to also return the trainer instances from the
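To illustrate the failure mode being discussed: when the only surviving reference to an object is a `weakref.ref`, the object is free to be collected as soon as the last strong reference (here, a local variable) goes out of scope. This is a minimal standalone sketch, not the actual test code; `Trainer` and `connect` are stand-ins.

```python
import gc
import weakref


class Trainer:
    pass


def connect():
    # The only strong reference lives in this local variable,
    # which disappears when the function returns.
    trainer = Trainer()
    return weakref.ref(trainer)


ref = connect()
gc.collect()
# The Trainer was reachable only through the weakref, so it was
# collected and the weakref now resolves to None.
print(ref() is None)  # True

# Keeping a strong reference alongside the weakref (e.g. returning the
# trainer from the helper and binding it in the test) prevents collection.
trainer = Trainer()
ref2 = weakref.ref(trainer)
gc.collect()
print(ref2() is trainer)  # True
```

This is why the suggested fix is to return the trainer from the helper function and keep it bound in the test body for the duration of the assertions.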
What does this PR do?
Fixes #13146
Changes:
- The same change was done to the `Loop.trainer` property for consistency. edit: it breaks the spawn queue

Does your PR introduce any breaking changes? If yes, please list them.

- `model.trainer` will now raise a `RuntimeError` if it hasn't been set.

Before submitting
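The behavior described above can be sketched as a property that resolves a weakly-held trainer and raises `RuntimeError` when it was never set, with the setter recursing into child modules (the "ensemble" case from the PR title). This is a simplified, hypothetical illustration, not the actual Lightning implementation; `Module`, `_children`, and `Trainer` are stand-in names.

```python
import weakref


class Trainer:
    pass


class Module:
    """Hypothetical stand-in for a LightningModule."""

    def __init__(self):
        self._trainer = None
        self._children = []  # e.g. the submodules of an ensemble

    @property
    def trainer(self):
        if self._trainer is None:
            raise RuntimeError(f"{type(self).__name__} is not attached to a Trainer.")
        trainer = self._trainer()  # resolve the weak reference
        if trainer is None:
            raise RuntimeError("The Trainer reference has been garbage-collected.")
        return trainer

    @trainer.setter
    def trainer(self, trainer):
        # Hold the trainer weakly to avoid a reference cycle, and propagate
        # the reference recursively to all child modules.
        self._trainer = weakref.ref(trainer)
        for child in self._children:
            child.trainer = trainer


# Usage: setting the trainer on the parent also sets it on the child.
parent = Module()
child = Module()
parent._children.append(child)

trainer = Trainer()
parent.trainer = trainer
```

With this sketch, `parent.trainer` and `child.trainer` both resolve to the same trainer, while accessing `.trainer` on a module that was never attached raises `RuntimeError`, matching the breaking change listed above.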
PR review
cc @Borda @carmocca @justusschock @awaelchli @ananthsub @ninginthecloud @jjenniferdai @rohitgr7 @akihironitta