Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Caching behavior related to PATH and use_default_shell_env are problematic #18809

Open
jbms opened this issue Jun 29, 2023 · 0 comments
Open
Labels
help wanted Someone outside the Bazel team could own this P3 We're not considering working on this, but happy to review a PR. (No assignee) team-Rules-CPP Issues for C++ rules type: bug

Comments

@jbms
Copy link

jbms commented Jun 29, 2023

Currently:

  • The Bazel action environment (which also affects its cache key) inherit PATH and LD_LIBRARY_PATH by default. If --incompatible_strict_action_env is specified, then a default PATH is used and LD_LIBRARY_PATH is not inherited. (Or these environment variables can be explicitly overridden via --action_env.)
  • The use_default_shell_env option defaults to False, which means by default commands are run with no PATH. If use_default_shell_env=True are specified, then the normal PATH and LD_LIBRARY_PATH values for the action_env are used.

This leads to the following problems:

  • When invoking Bazel from a Python setup.py script in order to build a Python extension with Bazel, as is done by tensorstore, the PATH may include temporary virtualenv directories created by pip. Even when using pip install -e . to install for local development purposes, pip will by default use an isolated build environment (which results in these temporary directories being added to the PATH in order to isolate build dependencies. This means that neither local nor remote caching works.
  • If we specify --incompatible_strict_action_env, then PATH is no longer passed to the action, and no longer affects the cache key, but this breaks building in environments/with toolchains that require PATH and/or LD_LIBRARY_PATH be set. For example, on Windows we may wish to use a self-contained version of MSVC or mingw where we haven't copied the various runtime DLLs to the Windows system directory. As a result, any binaries built will only run if the PATH is preserved so that they can locate the DLLs that they require (Windows does not have the equivalent of rpath). (See MinGW toolchain produces binaries that cannot be run with ctx.actions.run() by default #15059)
  • Even if we don't specify --incompatible_strict_action_env, a lot of rules still don't work with toolchains that require PATH to be set, because they invoke executables without setting use_default_shell_env=True. Since the rule author doesn't know the details of the toolchain or system environment, they cannot know that the default use_default_shell_env=False will be problematic.

I propose the following changes:

  • Introduce --exclude_action_env_from_cache=X to exclude a given action environment variable (if inherited or specified on the command-line, but not if specified in the rule directly) from the cache key. (Previously proposed here some years ago https://groups.google.com/g/bazel-discuss/c/AZek-OR1t7A)
  • Introduce --incompatible_exclude_path_from_action_cache_key. If set, this is equivalent to specifying --exclude_action_env_from_cache={PATH,LD_LIBRARY_PATH}.
  • Introduce --incompatible_use_default_shell_env which sets use_default_shell_env=True by default. This would require Add --incompatible_merge_fixed_and_default_shell_env #18235 to be practical.

As noted in https://groups.google.com/g/bazel-discuss/c/AZek-OR1t7A, including PATH in the action cache key does not ensure hermeticity anyway except in special cases such as when using nix, where the PATH is guaranteed to change if any of the executables change. In this rare case where including PATH in the cache key is helpful, --incompatible_exclude_path_from_action_cache_key=false can be specified by the user.

In #15059 an alternative to defaulting use_default_shell_env=True to solve the issues with mingw is proposed, that the toolchain should somehow add all of the runtime DLLs as runfiles. Note that the issue is not specific to mingw, though --- we may wish to build with a self-contained copy of MSVC (in fact we do this in the tensorstore CI build) and the same issue applies there, so this would basically be needed for all toolchains. This may work, but in general by breaking normal expectations of how the PATH should be handled, creates additional trouble for users. For example, if any other binaries used by actions need DLLs, those DLLs have to be copied to the same directory as the binary on Windows if PATH cannot be relied upon.

copybara-service bot pushed a commit to google/tensorstore that referenced this issue Jun 29, 2023
There is no specified way to detect the pip build environment,
so this relies on an empirical approach; when it works the bazel action
cache should be reused.

However this should improve the bazel action cache use and make
`pip install .` behave better.

See also: bazelbuild/bazel#18809

PiperOrigin-RevId: 544451090
Change-Id: Ibfbcfe5f6747786e090b460ea1531164fc9691df
@oquenchil oquenchil added P3 We're not considering working on this, but happy to review a PR. (No assignee) help wanted Someone outside the Bazel team could own this and removed untriaged labels Jul 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Someone outside the Bazel team could own this P3 We're not considering working on this, but happy to review a PR. (No assignee) team-Rules-CPP Issues for C++ rules type: bug
Projects
None yet
Development

No branches or pull requests

3 participants