Document how rules_haskell caches artifacts #1293

Open
joneshf opened this issue Mar 26, 2020 · 3 comments
Labels
P3 minor: not priorized type: documentation

Comments

@joneshf
Contributor

joneshf commented Mar 26, 2020

Is your feature request related to a problem? Please describe.

I've found caching with rules_haskell to be hard to figure out. I'm using rules_haskell in a project I'm working on. I run locally on my Linux machine and (mostly due to inertia) on three CI services: AppVeyor (Windows), GitLab CI (Linux), and TravisCI (Linux and macOS). I use the GHC bindists, as setting up nix across all platforms is not really in the cards. In each environment, something seems to miss the cache. It happens in different ways whether it's locally or in one of the CI services.

Locally, it seems to intermittently check if the stack stuff is up to date. I can't figure out how it decides to check this (so can't produce output at the moment). But every so often, I'll see it checking for updates of stack stuff. It's not a big issue locally, but it means that sometimes a bazel test //... is subsecond and other times it's 10-20 seconds. Aside from this check, things seem to work locally.

On the CI services, I cannot figure out how to cache the rules_haskell stuff correctly. The project has a transitive dependency on happy, which means it has to be handled differently from other Haskell dependencies. Each CI service seems unable to cache the result of happy and has to build it every time it's run. This means that each build on CI is about five minutes, even if nothing changed.

I've tried caching the typical bazel directories: ~/.cache/bazel on Linux (and equivalents on macOS and Windows). I've tried using the --disk_cache and --repository_cache flags to set the locations explicitly. None of these things seem to make caching work on the CI services.

Describe the solution you'd like

There are two parts to this:

  1. It would be nice to document how to set up rules_haskell correctly so it doesn't intermittently check for stack updates. I imagine there's something I don't have set up properly, but I don't know what it could be.
  2. It would be nice to document how to set up rules_haskell correctly so it hits caches. Again, I imagine there's something I don't have set up properly, but I don't know what it could be.

If there's some argument or flag that has to be turned on for either of these, maybe it could be flipped to be opt-out instead of opt-in so caching worked out of the box?

@aherrmann
Member

> Locally, it seems to intermittently check if the stack stuff is up to date.

We call stack update as a local repository rule, i.e. this is run whenever Bazel re-fetches. This is to work around a race on a lock within stack. Slower refetch is an unfortunate side effect of this. However, it shouldn't invalidate the cache for Stackage dependencies that are already cached.

> On the CI services, I cannot figure out how to cache the rules_haskell stuff correctly. The project has a transitive dependency on happy, which means it has to be handled differently from other Haskell dependencies. Each CI service seems unable to cache the result of happy and has to build it every time it's run. This means that each build on CI is about five minutes, even if nothing changed.

It's hard to say in general what's wrong here. There is a known reproducibility issue with haskell_cabal_binary|library, but this should not affect the caching within a CI pipeline, only across environments (e.g. shared remote cache between CI and devs). Are you aware of any changes between CI runs within one CI pipeline? E.g. different usernames, different working directories, different PATH, etc.? A good way to debug this is to compare execution logs of two runs that should be identical, following the steps described here.
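
The log comparison suggested above can be sketched roughly as follows. This is a minimal sketch, not an official tool: it assumes each run was recorded as a JSON execution log (Bazel can write one with a flag such as `--execution_log_json_file`), that the file holds JSON objects written back to back, and that each entry carries the fields `listedOutputs`, `commandArgs`, `environmentVariables`, and `inputs`. Those field names, and the sample entries, are assumptions for illustration and may not match your Bazel version.

```python
import json


def read_concatenated_json(text):
    """Parse a string that holds several JSON objects written back to back."""
    decoder = json.JSONDecoder()
    entries = []
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it by hand.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        entries.append(obj)
    return entries


def index_by_outputs(entries):
    """Key each action by its sorted output paths (assumed field name)."""
    return {tuple(sorted(e.get("listedOutputs", []))): e for e in entries}


def diff_logs(first, second,
              fields=("commandArgs", "environmentVariables", "inputs")):
    """Map each action present in both runs to the fields that differ."""
    a, b = index_by_outputs(first), index_by_outputs(second)
    report = {}
    for key in sorted(a.keys() & b.keys()):
        changed = [f for f in fields if a[key].get(f) != b[key].get(f)]
        if changed:
            report[key] = changed
    return report


# Made-up sample logs: the first action sees a job-specific PATH.
RUN_1 = '''
{"listedOutputs": ["bazel-out/bin/happy"], "commandArgs": ["build-happy"],
 "environmentVariables": [{"name": "PATH", "value": "/tmp/job-1/bin:/usr/bin"}]}
{"listedOutputs": ["bazel-out/bin/lib.a"], "commandArgs": ["ghc", "-c"],
 "environmentVariables": []}
'''
RUN_2 = '''
{"listedOutputs": ["bazel-out/bin/happy"], "commandArgs": ["build-happy"],
 "environmentVariables": [{"name": "PATH", "value": "/tmp/job-2/bin:/usr/bin"}]}
{"listedOutputs": ["bazel-out/bin/lib.a"], "commandArgs": ["ghc", "-c"],
 "environmentVariables": []}
'''

report = diff_logs(read_concatenated_json(RUN_1), read_concatenated_json(RUN_2))
for outputs, changed in report.items():
    print(outputs, "differs in", changed)
```

With real logs, replace the sample strings with the contents of the two log files. Any action that shows up in the report is a candidate for the cache miss.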

> I've tried caching the typical bazel directories: ~/.cache/bazel on Linux (and equivalents on macOS and Windows). I've tried using the --disk_cache and --repository_cache flags to set the locations explicitly. None of these things seem to make caching work on the CI services.

This suggests that something in the environment is changing and leaking into the cache keys. Temporary working or installation directories can have that effect, in particular if there are build steps that set use_default_shell_env = True, or repository rules that depend on PATH. In bindist mode, that is the case for the POSIX and Python toolchains used by rules_haskell. Comparing execution logs as described above should help pinpoint this. Additionally, Bazel provides ways to debug non-hermetic workspace rules.
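
To illustrate why a changing environment defeats the cache, here is a toy model of a cache key. It is not Bazel's actual key computation; the `action_key` function, the paths, and the digest strings are all hypothetical. It only shows the principle that a key derived from the command, the inputs, and the environment changes whenever any one of them changes.

```python
import hashlib
import json


def action_key(command, input_digests, env):
    """Toy cache key: hash of the command, input digests, and environment."""
    payload = json.dumps(
        {
            "command": command,
            "inputs": sorted(input_digests.items()),
            "env": sorted(env.items()),
        },
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


command = ["ghc", "-c", "Main.hs"]
inputs = {"Main.hs": "digest-of-Main.hs"}

# Two CI jobs whose PATH contains a per-job temporary directory.
run_1_env = {"PATH": "/tmp/ci-job-1/bin:/usr/bin:/bin"}
run_2_env = {"PATH": "/tmp/ci-job-2/bin:/usr/bin:/bin"}
# The same two jobs with a fixed PATH.
fixed_env = {"PATH": "/usr/bin:/bin"}

leaky_1 = action_key(command, inputs, run_1_env)
leaky_2 = action_key(command, inputs, run_2_env)
fixed_1 = action_key(command, inputs, fixed_env)
fixed_2 = action_key(command, inputs, fixed_env)

print("keys match with per-job PATH:", leaky_1 == leaky_2)
print("keys match with fixed PATH:", fixed_1 == fixed_2)
```

In this model, the restored cache directory is intact in both cases, but with the per-job PATH the second run looks up a key that was never stored, so the action is rebuilt.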

> I use the GHC bindists, as setting up nix across all platforms is not really in the cards.

It's possible to configure rules_haskell to work with Nix on Linux and macOS while using the bindist on Windows. The rules_haskell repository itself does that. Nix makes it much easier to achieve reproducible builds, and the Bazel that comes with Nix also includes some patches to improve reproducibility. It may well be worth the effort if the Linux and macOS use case allows for it.

@joneshf
Contributor Author

joneshf commented Mar 31, 2020

> We call stack update as a local repository rule, i.e. this is run whenever Bazel re-fetches. This is to work around a race on a lock within stack. Slower refetch is an unfortunate side effect of this. However, it shouldn't invalidate the cache for Stackage dependencies that are already cached.

Yeah, it seems to hit the cache. Just didn't know for sure if it was checking because I did something wrong. Glad to know that's how it's supposed to work!

Thanks for the suggestions on where to go. Will look into them when I get some time to diag.

joneshf added a commit to joneshf/purty that referenced this issue Nov 15, 2020
We worked around the sub-package issue, so we can use a newer version.

This version brings a welcome fix: better caching of the stack setup.
This was mentioned in the GitHub issue:
tweag/rules_haskell#1293. It seems like using
the `stack_snapshot_json` attribute might address all of the issues
raised in that GitHub issue. If it only addresses the local
intermittent checks it would be a win by itself.

We'll try it out, and see where we get.
joneshf added a commit to joneshf/purty that referenced this issue Nov 15, 2020
This should address some of the caching issues we run into.

As mentioned in this GitHub issue:
tweag/rules_haskell#1293, there are issues
with `stack` running intermittently locally, and with caching not really
working on any of the CI systems we use. It looks like the
`stack_snapshot_json` argument should help with that.

After working with it for a little while, it looks like we consistently
get sub-second `bazel test //...` when it should be a no-op! This is the
behavior we wanted, and it's here!

We'll have to see how this plays out on CI.

We also add a rule in the `Makefile` to help with regenerating that
pinned file when we make changes.
joneshf added a commit to joneshf/purty that referenced this issue Dec 5, 2020
We want to get up to date with our dependencies.

This version brings a welcome fix: better caching of the stack setup.
This was mentioned in the GitHub issue:
tweag/rules_haskell#1293. It seems like using
the `stack_snapshot_json` attribute might address all of the issues
raised in that GitHub issue. If it only addresses the local
intermittent checks it would be a win by itself.

We'll try it out, and see where we get.
Zelenya pushed a commit to Zelenya/purty that referenced this issue Feb 26, 2021
We worked around the sub-package issue, so we can use a newer version.

This version brings a welcome fix: better caching of the stack setup.
This was mentioned in the GitHub issue:
tweag/rules_haskell#1293. It seems like using
the `stack_snapshot_json` attribute might address all of the issues
raised in that GitHub issue. If it only addresses the local
intermittent checks it would be a win by itself.

We'll try it out, and see where we get.
@aherrmann aherrmann added the P3 minor: not priorized label Feb 21, 2022
@aherrmann
Member

A caching section in the use-cases documentation would be a good place for this.
