-
-
Notifications
You must be signed in to change notification settings - Fork 636
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Differentiate and propagate missing digest errors #15761
Merged
stuhood
merged 3 commits into
pantsbuild:main
from
stuhood:stuhood/propagate-missing-digest-errors
Jun 7, 2022
Merged
Differentiate and propagate missing digest errors #15761
stuhood
merged 3 commits into
pantsbuild:main
from
stuhood:stuhood/propagate-missing-digest-errors
Jun 7, 2022
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
…ocess_execution` and `nodes`. [ci skip-build-wheels]
[ci skip-build-wheels]
tdyas
approved these changes
Jun 7, 2022
This was referenced Jun 7, 2022
Merged
stuhood
added a commit
that referenced
this pull request
Jun 17, 2022
…ore resilient (#15850) As described in #11331, in order to avoid having to deal with missing remote content later in the pipeline, `--remote-cache-eager-fetch` currently defaults to true. This means that before calling a cache hit a hit, we fully download the output of the cache entry. In warm-cache situations, this can mean downloading a lot more than is strictly necessary. In theory, you could imagine `eager_fetch=False` downloading only stdio and no file content at all for a 100% cache hit rate run of tests. In practice, high hitrate runs [see about 80% fewer bytes downloaded, and 50% fewer RPCs](#11331 (comment)) than with `eager_fetch=True`. To begin moving toward disabling `eager_fetch` by default (and eventually, ideally, removing the flag entirely), this change begins "backtracking" when missing digests are encountered. Backtracking is implemented by "catching" `MissingDigest` errors (introduced in #15761), and invalidating their source `Node` in the graph. When a `Node` that produced a missing digest re-runs, it does so using progressively fewer caches (as introduced in #15854), in order to cache bust both local and remote partial cache entries. `eager_fetch=False` was already experimental, in that any `MissingDigest` error encountered later in the run would kill the entire run. Backtracking makes `eager_fetch=False` less experimental, in that we are now very likely to recover from a `MissingDigest` error. But it is still the case with `eager_fetch=False` that persistent remote infrastructure errors (those that last longer than our retry budget or timeout) could kill a run. Given that, we will likely want to gain more experience and further tune timeouts and retries before changing the default. Fixes #11331. [ci skip-build-wheels]
stuhood
added a commit
that referenced
this pull request
Jun 29, 2022
…15979) As reported in #15954, backtracking doesn't currently work when a synchronous method like `Workspace.write_digests` is the source of a `MissingDigest` error. This was due to a TODO left behind in #15761, where we did not propagate the `Failure` type "through" `@rule` bodies. To fix this, we add a conversion from `Failure` to `PrErr` which wraps in a well known exception type, and then look for that type when converting back from `PyErr` to `Failure`. Fixes #15954.
tdyas
pushed a commit
to tdyas/pants
that referenced
this pull request
Jun 29, 2022
…antsbuild#15979) As reported in pantsbuild#15954, backtracking doesn't currently work when a synchronous method like `Workspace.write_digests` is the source of a `MissingDigest` error. This was due to a TODO left behind in pantsbuild#15761, where we did not propagate the `Failure` type "through" `@rule` bodies. To fix this, we add a conversion from `Failure` to `PrErr` which wraps in a well known exception type, and then look for that type when converting back from `PyErr` to `Failure`. Fixes pantsbuild#15954. # Building wheels and fs_util will be skipped. Delete if not intended. [ci skip-build-wheels]
stuhood
added a commit
to stuhood/pants
that referenced
this pull request
Jun 29, 2022
…antsbuild#15979) As reported in pantsbuild#15954, backtracking doesn't currently work when a synchronous method like `Workspace.write_digests` is the source of a `MissingDigest` error. This was due to a TODO left behind in pantsbuild#15761, where we did not propagate the `Failure` type "through" `@rule` bodies. To fix this, we add a conversion from `Failure` to `PrErr` which wraps in a well known exception type, and then look for that type when converting back from `PyErr` to `Failure`. Fixes pantsbuild#15954. # Building wheels and fs_util will be skipped. Delete if not intended. [ci skip-build-wheels]
stuhood
added a commit
that referenced
this pull request
Jun 29, 2022
…Cherry-pick of #15979) (#16001) As reported in #15954, backtracking doesn't currently work when a synchronous method like `Workspace.write_digests` is the source of a `MissingDigest` error. This was due to a TODO left behind in #15761, where we did not propagate the `Failure` type "through" `@rule` bodies. To fix this, we add a conversion from `Failure` to `PrErr` which wraps in a well known exception type, and then look for that type when converting back from `PyErr` to `Failure`. Fixes #15954. [ci skip-build-wheels] Co-authored-by: Stu Hood <stuhood@gmail.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
In order to begin lazily fetching digests for #11331, we need to be able to differentiate errors caused by missing digests.
This change introduces
{StoreError,ProcessError,Failure}::MissingDigest
, which are bubbled up to the level of graphNode
s, and which will allow us to introduce retry in a followup change.MissingDigest
is produced byload_bytes_with
and a few other low-levelStore
methods (which now return anErr(MissingDigest)
rather thanOption
).There should not be any behavior changes.