Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(ci): enable pip cache #198

Merged
merged 4 commits into from
Mar 24, 2023
Merged

feat(ci): enable pip cache #198

merged 4 commits into from
Mar 24, 2023

Conversation

SauravMaheshkar
Copy link
Contributor

This PR sets the cache and cache-dependency-path argument of actions/setup-python to enable pip caching.

@younesbelkada younesbelkada requested a review from lvwerra March 6, 2023 13:13
@HuggingFaceDocBuilderDev
Copy link

HuggingFaceDocBuilderDev commented Mar 6, 2023

The documentation is not available anymore as the PR was closed or merged.

@lvwerra
Copy link
Member

lvwerra commented Mar 6, 2023

Hi @SauravMaheshkar, thanks for opening a PR! If I understand it correctly this integration hashes the requirements/setup files to cache dependencies. In that case we would be quite blind to breaking changes from new versions of a library, right?

@SauravMaheshkar
Copy link
Contributor Author

@lvwerra as far as I understand, the action reuses cache if the files listed under the cache-dependency-path haven't been changed for while. If the files have been recently changed then it updates the cache. So we should not be blind to breaking changes from new versions.

@lvwerra
Copy link
Member

lvwerra commented Mar 13, 2023

I think the issue is that we don't change the files in question often so it is possible that there are breaking changes that we don't notice because we never updated the cached files. Is there an option to set a cache lifetime?

@SauravMaheshkar
Copy link
Contributor Author

AFAIK there isn't a option to set cache lifetime however we can use the cache-hit flag to check if a cache hit has occurred. Source

@lvwerra
Copy link
Member

lvwerra commented Mar 13, 2023

Hi @SauravMaheshkar, not sure I follow how the cache-hit would help. I would like to at least clear the cache once every 1-2 days since libraries might be updated that the we won't install due to caching.

@SauravMaheshkar
Copy link
Contributor Author

Hey @lvwerra thanks for the feedback, you do make an interesting case. Upon searching around for a bit, we can create another workflow similar to the one in the Github Docs which can clear all the caches every n days.

@SauravMaheshkar
Copy link
Contributor Author

Do you want me to work on that as well in this PR ?

@lvwerra
Copy link
Member

lvwerra commented Mar 16, 2023

Hi @SauravMaheshkar, yes, if we can delete the cache every 1-2 days adding caching would be a good feature.

@SauravMaheshkar
Copy link
Contributor Author

@lvwerra I added a proposal for a workflow in e19ce20

@lvwerra
Copy link
Member

lvwerra commented Mar 21, 2023

Awesome work @SauravMaheshkar , that looks great to me! What do you think @younesbelkada ?

Copy link
Contributor

@younesbelkada younesbelkada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for this awesome feature!
I am totally ok for this PR, I just have few (basic) questions

  • Where the duration of the cache refreshing mechanism (you stated each 1 or 2 days) can be changed on the file you have added?
  • Why the changes on tests.yml are needed? It seems to be applied only on python 3.8, do we need to change it as well for other python versions?
    Thanks a lot!

@lvwerra
Copy link
Member

lvwerra commented Mar 21, 2023

For the first question - cron: "0 0 * * *" specifies refresh at midnight (see here). Also curious about the second one!

@younesbelkada
Copy link
Contributor

younesbelkada commented Mar 21, 2023

For the first question - cron: "0 0 * * *" specifies refresh at midnight (see here).

Awesome thank you!

@SauravMaheshkar
Copy link
Contributor Author

Thanks a lot for this awesome feature! I am totally ok for this PR, I just have few (basic) questions

  • Where the duration of the cache refreshing mechanism (you stated each 1 or 2 days) can be changed on the file you have added?
  • Why the changes on tests.yml are needed? It seems to be applied only on python 3.8, do we need to change it as well for other python versions?
    Thanks a lot!

We can run the tests on other python versions as well. The changes proposed by this PR is independent of the python version. We can test for more python versions as well.

@younesbelkada
Copy link
Contributor

younesbelkada commented Mar 21, 2023

Thanks a lot for clarifying!

The changes proposed by this PR is independent of the python version.

I am a bit confused by this, can you elaborate a bit more on that?

Also I can see that the modified block on tests.yml only touches the check-code-quality workflow. Ultimately my question is if you can point me to the place that ensures the caching refreshing mechanism is triggered on the CI runners?

Again thank you!

@lvwerra
Copy link
Member

lvwerra commented Mar 23, 2023

Hi @SauravMaheshkar just to double check: it seems like the cache is only a pip download cache and the libraries still need to be installed (which takes a significant amount of time). Is this expected?

@SauravMaheshkar
Copy link
Contributor Author

Hi @SauravMaheshkar just to double check: it seems like the cache is only a pip download cache and the libraries still need to be installed (which takes a significant amount of time). Is this expected?

Yes it's just a download cache AFAIK

@lvwerra lvwerra merged commit 89df6ab into huggingface:main Mar 24, 2023
@SauravMaheshkar SauravMaheshkar deleted the ci-pip-cache branch March 24, 2023 09:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants