Support volatile processes. #10768
This just wraps up the current UUID hack behind a type to prevent the surface area of the hack from spreading. If we eventually plumb a volatility flag through Process to the Rust side we can still do so with a small change to the current UUID env var hack and a controlled deprecation of the VolatileProcess type in favor of the Process flag should it come to exist. [ci skip-rust] [ci skip-build-wheels]
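For readers unfamiliar with the hack, the idea can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `Process` stand-in, the env var name `__VOLATILE_PROCESS_UUID`, and the signature of `make_process_volatile` are all assumptions made for the example.

```python
import uuid
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Process:
    """Hypothetical stand-in for the engine's Process type."""

    argv: Tuple[str, ...]
    env: Tuple[Tuple[str, str], ...] = ()


@dataclass(frozen=True)
class VolatileProcess:
    """Marks a Process whose result should never be served from cache."""

    process: Process


# Assumed name; the real env var used by the hack may differ.
VOLATILE_ENV_VAR = "__VOLATILE_PROCESS_UUID"


def make_process_volatile(wrapper: VolatileProcess) -> Process:
    """Return a copy of the wrapped Process salted with a fresh UUID.

    Because the env is part of the process's identity, every call yields
    a Process that compares unequal to all earlier ones, so a cache keyed
    on the Process can never hit.
    """
    original = wrapper.process
    salted_env = original.env + ((VOLATILE_ENV_VAR, str(uuid.uuid4())),)
    return replace(original, env=salted_env)
```

Keeping the salting in one function is what limits the surface area: if a real volatility flag ever lands on `Process`, only this function would need to change.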
src/python/pants/engine/process.py
# could be gone tomorrow. Ideally we'd only do this for local processes since all known
# remoting configurations include a static container image as part of their cache key which
# automatically avoids this problem.
VolatileProcess(
Aaaaand ... this is still broken in realistic ways. Namely, upgrading or downgrading a binary found here previously will result in the same output from this rule and thus not trigger a re-run of rules depending on this data, which is wrong. The only cases currently covered by this fix from the list below are 1 and 2:
1. An applicable binary is added to the search path.
2. An applicable binary is removed from the search path.
3. An applicable binary on the search path is modified (upgraded or downgraded most likely).
To handle 3, the contents of the discovered binaries need to be hashed. (And even that is not enough! Dynamically linked libraries could change the binary's output ... but we'll need to whistle past the graveyard - that level of detail can really only be addressed by always running in an image / using a local docker / podman / runc / crun type solution.) This could be done by storing the binary in the CAS as a hack, but we won't / can't generally use the binary stored in the CAS, so that just adds more storage bloat (see the `make_process_volatile` comment). Ideally we'd ask an engine intrinsic to get us a fingerprint without storing the object. It could also be done by just hashing the file at the resulting absolute path in Python rule code, at the expense of potentially heavy IO inhibiting parallelism. On my machine, hashing my biggest binary (/usr/bin/packer @ 154M) takes ~400ms in pure Python code and hashing the currently more typical python binary (@ 19M) takes ~50ms.
I'd like to defer handling case 3 to a follow-up if folks are amenable.
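As a rough illustration of the pure-Python option for case 3, here is a minimal sketch of fingerprinting a binary by streaming it through SHA-256. The function name `fingerprint_binary` and the 1 MiB chunk size are assumptions for the example, not anything in this PR.

```python
import hashlib
from pathlib import Path


def fingerprint_binary(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the hex SHA-256 of the file at `path`.

    The file is read in fixed-size chunks so that a large binary is
    never held in memory all at once.
    """
    digest = hashlib.sha256()
    with path.open("rb") as fp:
        # iter() with a sentinel stops at the empty bytes read at EOF.
        for chunk in iter(lambda: fp.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

This only detects changes to the file itself; as noted above, it says nothing about dynamically linked libraries.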
Indeed. Tom had suggested hashing the binaries, as well. My original design was going to update the `find_binary` script to also run `shasum -a 256` if available, then to store a private field `_hashes` on the `BinaryPaths` object. Call sites wouldn't use this, but it would result in the cache becoming invalidated if the hashes change, even if the paths remain the same.
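A minimal sketch of that design, assuming `BinaryPaths` is a frozen dataclass (the field names other than `_hashes` are hypothetical):

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class BinaryPaths:
    """Hypothetical shape: discovered paths plus a private fingerprint field."""

    binary_name: str
    paths: Tuple[str, ...]
    # Not meant for call sites. It participates in equality and hashing,
    # so a changed binary yields a different value even at the same path.
    _hashes: Tuple[str, ...] = ()


before = BinaryPaths("python", ("/usr/bin/python",), ("aaa111",))
after = BinaryPaths("python", ("/usr/bin/python",), ("bbb222",))
# Same paths, different fingerprints: anything memoized on the old value
# is no longer a match for the new one.
paths_match = before.paths == after.paths
values_match = before == after
```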
I don't think we need to block on case 3. By definition, fixing cases 1 and 2 will be an improvement over what we have right now.
We should track this, though. It could result in really subtle issues that users can only fix if they know to purge `lmdb_store`.
--
It gets even worse for Pyenv, which was the whole motivation of this fix. `~/.pyenv/shims/python` is a Bash script. It will always have the same hash, regardless of what interpreter is used. The only "fixes" I could imagine are a) calling `pyenv which python` somehow, or b) telling users to stop using Pyenv how you're supposed to, e.g. don't use `pyenv global` and instead explicitly list the `.pyenv/versions` on their PATH.
Yeah, I think `shasum` if available is not a great idea. Even relying on `bash` and an optional `which`, as we do now, is bad and is TODO'd for replacement with a native binary for each platform we support.
Pyenv shims would be solved IIUC by something this should already be doing, which is allowing a test function. Really a test argv, where each found binary is executed as `argv0 + argvTest`. After all, the found binary is just name matched. It may be broken, it may be malicious, etc. A test function at least provides a nominal check that the binary works, and the output of the test function, if any, could be mixed into the hash.
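A minimal sketch of the test-argv idea, using plain `subprocess` rather than the engine (the function name `fingerprint_via_test` and the timeout are assumptions for the example): run the candidate with the test arguments, reject it if it fails, and otherwise fingerprint whatever it printed.

```python
import hashlib
import subprocess
from typing import Optional, Sequence


def fingerprint_via_test(
    argv0: str, test_args: Sequence[str], timeout: float = 10.0
) -> Optional[str]:
    """Run `argv0 + test_args`; return a hash of its output, or None on failure.

    None means the candidate could not be run, timed out, or exited
    non-zero, so it should not be treated as a usable binary.
    """
    try:
        result = subprocess.run(
            [argv0, *test_args],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # some tools print versions to stderr
            timeout=timeout,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None
    return hashlib.sha256(result.stdout).hexdigest()
```

For a Pyenv shim, a test such as `["-c", "import sys; print(sys.version)"]` would produce different output when the underlying interpreter changes, even though the shim script itself does not.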
I'll follow up with this change and temporarily push hashing into the check process, with an issue broken out to separate the two later. Since Python is an interpreter, we just happen to be able to do fancy checks with it - for other binaries that won't be possible.
The follow-up is in #10770.
Very good call to reduce the surface area. This is a good change for our plugin API, too. We know at least one user wants to run Git processes for their plugin, which should never be cached.
How about "uncacheable" instead of "volatile"? I acknowledge "volatile" has prior art with C/C++/Java. However, in an ideal world, to avoid storage bloat, we will never cache an uncacheable process in the first place. Personally, I think "uncacheable" is simpler English than having to be familiar with "volatile" from those languages.
Uncacheable sgtm.