ETXTBSY during Coursier fetch #13424
I'm pretty sure this is because the executable is not in the chroot (bash): pants/src/python/pants/jvm/resolve/coursier_fetch.py Lines 361 to 376 in 710170f
But my mitigation of yore looks explicitly for that condition to trigger fancy retries: pants/src/rust/engine/process_execution/src/local.rs Lines 361 to 418 in 044af4a
We probably just need to apply the mitigation unconditionally.
Some of our Processes use system binaries to run scripts in the input digest of the Process. This foiled the original ETXTBSY fix which was only used when argv0 was a member of the input digest. Coursier fetch was an example of this, using the system provided bash to execute a script in the input digest which in turn executed a binary in the input digest. Fix this by just pessimistically assuming all Process invocations will execute files in the input digest and always performing the mitigation. Fixes pantsbuild#13424 [ci skip-build-wheels]
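For readers unfamiliar with the retry half of the mitigation, here is a minimal Python sketch of "always retry on ETXTBSY". It is not the engine's actual Rust implementation in `local.rs`; the function name `spawn_with_etxtbsy_retry` and its parameters are hypothetical, and it only covers the top-level spawn.

```python
import errno
import subprocess
import time


def spawn_with_etxtbsy_retry(argv, max_attempts=5, initial_delay=0.01, **popen_kwargs):
    """Spawn argv, retrying with exponential backoff when exec fails with ETXTBSY.

    Hypothetical helper: the retry is applied unconditionally, whether or not
    argv[0] was materialized into the sandbox.
    """
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        try:
            # On POSIX, a failed exec in the child surfaces here as an OSError.
            return subprocess.Popen(argv, **popen_kwargs)
        except OSError as e:
            # Re-raise anything that is not "Text file busy", or give up
            # once the attempts are exhausted.
            if e.errno != errno.ETXTBSY or attempt == max_attempts:
                raise
            time.sleep(delay)
            delay *= 2
```

A retry like this only sees failures from the process it spawns directly; an ETXTBSY hit by a grandchild surfaces as that child's own failure instead.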
Hrm, I just realized always applying the ETXTBSY mitigation won't help this case at all. To clarify this case, we have the following spawn chain:
In other words, although this fix originates from the Pants engine code and mitigates the top-level process encountering ETXTBSY, it does nothing for the child. The only idea I have is to add an intrinsic shim launcher binary that can be requested as part of an input digest and that contains the same code as the relevant part of the Pants engine. When that binary is present in the input digest, the Pants engine first executes it as a no-op, using the mitigation, to prove it's viable, then proceeds with the user process invocation. Inside the user process, instead of invoking processes directly in bash, they use the shim launcher binary. So instead of
I guess - more simply - the spawn shim could be requested in the Process struct and then the launch from Rust would be of
As background for other readers: Longer-term fixes in the Linux kernel still seem to face skepticism: I found the most recent attempt to get a Maybe we should start advocating for that change? Maybe even team up with other build systems (e.g., Bazel)?
For a near-term solution, would the following work (at least on Linux)? Spawn a thread pool that will be used for writing out files to execution sandboxes. When the threads first boot, they call the For macOS (or other UNIX systems without an There is some prior art for this: Buildbox (used by BuildGrid) has the notion of a "Local CAS Service" which includes sandbox setup via a An alternate solution might just adopt the gRPC approach instead of using unshared threads. Pants would spawn a separate sandbox-setup process which receives gRPC requests to set up sandboxes. This might assist with using FUSE for sandbox setup by delegating the whole sandbox setup to a separate process which can choose how to set up the sandbox.
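To make the separate-process idea concrete, here is a rough Python sketch under simplifying assumptions: instead of a long-lived sandbox-setup service speaking gRPC, it starts one short-lived helper process per file. The names `materialize_via_helper` and `_WRITER_SRC` are hypothetical. The point it illustrates is that the writable file descriptor only ever exists in the helper, so a child forked concurrently by the main process cannot inherit it.

```python
import os
import subprocess
import sys

# Source of the helper process. It receives the target path and an octal mode
# as arguments and the file content on stdin, so the main process never opens
# the target file for writing.
_WRITER_SRC = r"""
import os, sys
path, mode = sys.argv[1], int(sys.argv[2], 8)
data = sys.stdin.buffer.read()
fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, mode)
with os.fdopen(fd, "wb") as f:
    f.write(data)
"""


def materialize_via_helper(path, data, mode=0o755):
    """Write `data` to `path` from a separate process (hypothetical helper).

    The helper has exited, and therefore closed its write fd, by the time
    this function returns.
    """
    subprocess.run(
        [sys.executable, "-c", _WRITER_SRC, path, format(mode, "o")],
        input=data,
        check=True,
    )
```

A real service would amortize process startup across many requests; the one-shot helper is used here only to keep the sketch short.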
I like the separate process idea. There is 0 ambiguity then and it works for all OSes we'll support.
Doing this work from dedicated threads or a separate process sounds feasible. But I think that at least for the next few weeks we should likely hack in a workaround until we've finished stabilizing the JVM: essentially, edit the wrapper scripts to include a bash retry loop. That's obviously not scalable, but the board is pretty full: https://github.com/pantsbuild/pants/projects/22?fullscreen=true, and I'd like to clear up a few more unknowns before investing time here.
#13848 should have made this issue much less frequent, since we will materialize the
…ies. (#14812) The `exclusive_spawn` facility to avoid/retry for `ExecutableFileBusy` / "Text file busy" is triggered by having materialized `arg[0]` of a process into the process sandbox. But in the presence of a `Process::working_directory` and a relative path as `arg[0]`, the facility was not being triggered (since validation of `arg[0]` as a `RelativePath` would fail due to it escaping its root). This relates to #13424 (which would remove the need for the `exclusive_spawn` facility), but does not fix it: only ensure that we handle an existing known case. [ci skip-build-wheels]
…ies. (cherrypick of #14812) (#14816) The `exclusive_spawn` facility to avoid/retry for `ExecutableFileBusy` / "Text file busy" is triggered by having materialized `arg[0]` of a process into the process sandbox. But in the presence of a `Process::working_directory` and a relative path as `arg[0]`, the facility was not being triggered (since validation of `arg[0]` as a `RelativePath` would fail due to it escaping its root). This relates to #13424 (which would remove the need for the `exclusive_spawn` facility), but does not fix it: only ensure that we handle an existing known case. [ci skip-build-wheels]
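The path issue described in these commit messages can be illustrated with a small Python sketch. This is not the engine's Rust code; `argv0_in_sandbox` is a hypothetical name, and the sketch assumes that `working_directory` is relative to the sandbox root and that an `argv0` without a slash is resolved via `PATH` rather than from the sandbox.

```python
import posixpath


def argv0_in_sandbox(argv0, working_directory=None):
    """Return True if argv0 resolves to a path inside the sandbox root.

    Hypothetical helper. Assumes `working_directory` is relative to the
    sandbox root and that a bare name (no "/") is looked up via PATH.
    """
    if "/" not in argv0 or posixpath.isabs(argv0):
        return False
    if working_directory and posixpath.isabs(working_directory):
        return False
    # Resolve argv0 relative to the working directory *before* checking
    # whether it escapes the root; checking argv0 on its own rejects "../x".
    normalized = posixpath.normpath(posixpath.join(working_directory or "", argv0))
    return not (normalized == ".." or normalized.startswith("../"))
```

With a working directory of `src`, an `argv0` of `../bin/tool` normalizes to `bin/tool` and counts as inside the sandbox, whereas validating `../bin/tool` alone would treat it as escaping the root.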
Observed on main @ cbee9ff