Deactivate caching in fulltest pipeline #294
@@ -1,7 +1,7 @@
# NOTES:
# - The map syntax used for matrix is flagged red but actually works
# - This runs everything in Python 3.10, 3.11 and 3.12
# - No environments are cached due to space limit
the old reason is in principle still true so why delete?
But that's not the primary reason why we don't want caching here, right? I thought we explicitly want to trigger the full installation pipeline (including newest package versions etc.) in order to have a true e2e test? While the old reason might still be valid, it doesn't matter here. Or how would you even want me to express that? I guess not like "... to perform a full end-to-end test. Even if we didn't care about e2e, environments would still not be cached due to space limits." 😄
no, it was certainly one of the reasons and still is: if this action creates lots of caches, it might delete caches from CI because there's a limit. just keep it and add your reason, no reason to delete
closing due to #298
Due to continuing serialization problems that were thought to be related to caching, #277 deactivated core test caching and #294 was prepared to do the same for the full test environment. This PR reactivates caching and instead refactors the class layout of `SKLearnClusteringRecommender` in an attempt to fix the root cause.

Mysteriously, the top-level import of `sklearn.mixture.GaussianMixture` seems to cause trouble. While the reason is still unclear, turning it into a lazy import (which will also come in handy later when making `scikit-learn` an optional dependency) seems to resolve the problem.

On a side note: deactivating slots for the recommenders solves the problem as well. This suggests that the root cause could be related to classes not being properly garbage collected (since `attrs` needs to create new classes when slots are activated). That could also explain why `GaussianMixtureClusteringRecommender` seemed to have improperly overridden methods after deserialization: for example, the `__repr__` of a created Gaussian mixture recommender correctly pointed to its own class before serialization but to the `__repr__` of `SKLearnClusteringRecommender` after serialization (weirdly, only when executed in `tox`).
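For readers unfamiliar with the lazy import pattern mentioned above, here is a minimal sketch of the idea. It is not the code from this PR: `LazyClusteringRecommender` and its `model_class` method are hypothetical names, and only the default target `sklearn.mixture.GaussianMixture` comes from the description.

```python
import importlib


class LazyClusteringRecommender:
    """Hypothetical stand-in for a recommender that wraps a model class."""

    def __init__(self, module_name="sklearn.mixture", class_name="GaussianMixture"):
        # Only the names are stored here; nothing is imported yet.
        self.module_name = module_name
        self.class_name = class_name

    def model_class(self):
        # The import runs on first use instead of at module import time.
        # Repeated calls are cheap because Python caches modules in sys.modules.
        module = importlib.import_module(self.module_name)
        return getattr(module, self.class_name)
```

With this layout, importing the module and constructing the object do not require `scikit-learn`; only calling `model_class()` with the defaults does, which is also what makes the pattern useful for optional dependencies.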
This PR removes the last remaining caching steps from our pipeline due to the mysterious interference with serialization. Caching can be reactivated once the root cause is identified.