Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add download and environment caching #38

Merged
merged 9 commits into from
Feb 24, 2022

Conversation

jonashaag
Copy link
Contributor

This builds upon (and includes) #37.

Adds options to either cache entire environments (envs/myenv directory) or downloads (pkgs directory).

Sorry that this has become essentially a rewrite of most of the code.

@wolfv
Copy link
Member

wolfv commented Feb 8, 2022

I don't mind a rewrite at all if everything keeps working ;)

@jonashaag
Copy link
Contributor Author

The thing is that we don't have perfect test coverage here (also I'm not sure how to properly test actions). I will dogfood this to make sure it works, maybe you can do that as well.

@wolfv
Copy link
Member

wolfv commented Feb 8, 2022

@jonashaag Happy to merge this tomorrow morning -- then we can have a look throughout the day if things break for someone.

I was wondering if we should document caching in the Readme as well? That could be cool.

@jonashaag
Copy link
Contributor Author

@wolfv if you have the time I'd also appreciate a code review.

@wolfv
Copy link
Member

wolfv commented Feb 10, 2022

Looks like the env cache is working like a charm! 30s vs 3s, congrats!

Do you know why that is skipped on the last test case?

@jonashaag
Copy link
Contributor Author

ping @wolfv :)

@@ -5,21 +5,58 @@ branding:
color: "green"
inputs:
environment-file:
description: "the environment.yml file for the conda environment"
description: >-
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are these rendered somewhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope

@@ -2,104 +2,116 @@

[![test](https://github.com/mamba-org/provision-with-micromamba/workflows/test/badge.svg)](https://github.com/mamba-org/provision-with-micromamba/actions?query=workflow%3Atest)

GitHub Action to provision a CI instance using micromamba
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You removed the inputs here, are the option descriptions rendered somewhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, do you want it back? I think it's ok to have people look at the action.yml file but I'm also OK with duplicating the docs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH I think it's nice to have the overview of supported settings right there in the readme. I didn't really know about the action.yml file...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, will bring it back.

extra-specs: |
python=3.7
pytest=${{ matrix.pytest }}
```

## Example with download caching

Use `cache-downloads` to enable download caching across action runs (`.tar.bz2` files).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we add up- and downsides of either way of caching? Which works better for which use case?

Copy link
Contributor Author

@jonashaag jonashaag Feb 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I'm not quite sure about those, given that we haven't used it a lot so far. My thoughts:

  • Env caching should always work OK given how Conda works, and assuming that tar + untar doesn't lose file system permissions etc. One caveat is that you might have older versions of dependencies but that is limited to the cache TTL (currently hardcoded to 1 day). If you make any changes to your Conda env folder for whatever horrible reason, the changes are included in the cached version and your build may break.
  • Download caching is generally slower because it only eliminates the download time. Actually it doesn't eliminate it, only (presumably) reduce it because it's loaded from a "faster" network (Azure Blob storage?!) Potentially it's even slower than an uncached install because download + extraction are possibly done with less parallelism. In practice it always seems to be faster though. Advantage of download caching is that you are always up to date.

Other ideas?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jonashaag is the micromamba action also compatible with using Github's cache v3 action to cache mamba env?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my experience it seems that cache-environment: true leads to the environment being cache for a lot longer than a day: at least a week, probably longer.
This leads to updates of dependencies not being installed/updated in nightly test runs, which is a problem, so I had to disable it: geofileops/geofileops#381

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please note that this action is deprecated

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please note that this action is deprecated

I'm using setup-micromamba@v1, but I was sent here via the link in its README.md, below When to use environment caching

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah... I found the following issue that reports the same issue I'm facing: mamba-org/setup-micromamba#50

@wolfv
Copy link
Member

wolfv commented Feb 17, 2022

yep, I read through it all and I am really happy with this! The only thing I was thinking about was the different cache options, if we should give them both to the users -- but probably that's fine! We could say which one we consider the best for what workflow.

@wolfv
Copy link
Member

wolfv commented Feb 24, 2022

Looks like some invalid YAML somewhere? Error: Input does not meet YAML 1.2 "Core Schema" specification: cache-env-always-update

@jonashaag
Copy link
Contributor Author

@wolfv fixed!

@wolfv wolfv merged commit 1876988 into mamba-org:main Feb 24, 2022
@jonashaag
Copy link
Contributor Author

🎉

@wolfv
Copy link
Member

wolfv commented Feb 24, 2022

Awesome stuff, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants