Deactivate caching in fulltest pipeline #294

Closed


AdrianSosic
Collaborator

This PR removes the last remaining caching steps from our pipeline due to the mysterious interference with serialization. Can be reactivated once the root cause is identified.

AdrianSosic added the `repo` label (Requires changes to the project configuration) Jul 2, 2024
AdrianSosic self-assigned this Jul 2, 2024
@@ -1,7 +1,7 @@
# NOTES:
# - The map syntax used for matrix is flagged red but actually works
# - This runs everything in Python 3.10, 3.11 and 3.12
# - No environments are cached due to space limit
Collaborator


The old reason is still true in principle, so why delete it?

Collaborator Author


But that's not the primary reason why we don't want caching here, right? I thought we explicitly want to trigger the full installation pipeline (including the newest package versions etc.) in order to have a true e2e test. While the old reason might still be valid, it doesn't matter here. Or how would you even want me to express that? I guess not like "... to perform a full end-to-end test. Even if we didn't care about e2e, environments would still not be cached due to space limits." 😄

Collaborator


No, it was certainly one of the reasons and still is: if this action creates lots of caches, it might delete caches from CI because there's a limit. Just keep it and add your reason, no reason to delete.

Collaborator

Scienfitz commented Jul 3, 2024

closing due to #298

Scienfitz closed this Jul 3, 2024
AdrianSosic added a commit that referenced this pull request Jul 3, 2024
Due to continuing serialization problems that were thought to be related
to caching, #277 deactivated core test caching and #294 was prepared
to do the same for the full test environment.

This PR reactivates caching and instead refactors the class layout of
`SKLearnClusteringRecommender` in an attempt to fix the root cause.
Mysteriously, the top-level import of `sklearn.mixture.GaussianMixture`
seems to cause trouble. While the reason is still unclear, turning it
into a lazy import (which will also come in handy later when making
`scikit-learn` an optional dependency) seems to resolve the problem.
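
For readers unfamiliar with the pattern, here is a minimal sketch of what a lazy import can look like. The class name `LazyGaussianMixtureRecommender`, its `_make_model` method, and the helper `sklearn_mixture_loaded` are hypothetical and only illustrate the idea; they are not the actual layout of `SKLearnClusteringRecommender`.

```python
import sys


class LazyGaussianMixtureRecommender:
    """Hypothetical stand-in for a recommender wrapping a scikit-learn model."""

    def __init__(self, n_components: int = 2):
        self.n_components = n_components

    def _make_model(self):
        # Deferred import: scikit-learn is only loaded when a model is
        # actually requested, not when this module is imported.
        from sklearn.mixture import GaussianMixture

        return GaussianMixture(n_components=self.n_components)


def sklearn_mixture_loaded() -> bool:
    """Report whether `sklearn.mixture` has been imported in this process."""
    return "sklearn.mixture" in sys.modules
```

Defining and instantiating the class does not touch `scikit-learn`; only a call to `_make_model()` triggers the import, which is also what would let the package become an optional dependency.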

On a side note: deactivating slots for the recommenders solves the
problem as well. This suggests that the root cause could be related to
classes not being properly garbage collected (since `attrs` needs to
create new classes when slots are activated). That could also explain
why `GaussianMixtureClusteringRecommender` seemed to have improperly
overridden methods after deserialization: for example, the `__repr__` of
a created Gaussian mixture recommender correctly pointed to its own
class before serialization but to the `__repr__` of
`SKLearnClusteringRecommender` after serialization (weirdly, only
when executed in `tox`).