Add --incompatible_sandbox_hermetic_tmp #16336

Closed
wants to merge 1 commit into from

fmeum
Collaborator

@fmeum fmeum commented Sep 24, 2022

With the new flag, each Linux sandbox will have its own dedicated empty
directory mounted as /tmp rather than sharing /tmp with the host
filesystem.

This is necessary since the Linux sandbox uses a PID namespace, but many
tools (e.g. the JVM) create files at well-known locations such as
/tmp/.javapid${PID}, which leads to collisions between different
sandboxes and the host. These collisions can result in crashes or
surprising situations such as Java agents being attached to a JVM in a
different sandbox.
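The collision can be illustrated with a minimal Java sketch. It is only a model, not Bazel or JVM code: the class and method names are hypothetical, the sandbox directory paths are made up, and the file name pattern is simply the one quoted above.

```java
import java.nio.file.Path;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustrative only: models why PID-derived file names collide in a shared /tmp. */
final class PidFileCollisionSketch {

  /** Host-side location of the PID marker file, given the directory that backs /tmp. */
  static Path pidFile(Path tmpBackingDir, long pidInNamespace) {
    return tmpBackingDir.resolve(".javapid" + pidInNamespace);
  }

  /** Returns true if any two of the given paths are identical. */
  static boolean hasCollision(List<Path> paths) {
    Set<Path> seen = new HashSet<>();
    for (Path p : paths) {
      if (!seen.add(p)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // Hypothetical: two sandboxes, each with its own PID namespace, both see PID 2.
    long pid = 2;

    // Shared host /tmp: both sandboxes end up with the very same file.
    List<Path> shared =
        List.of(pidFile(Path.of("/tmp"), pid), pidFile(Path.of("/tmp"), pid));

    // Hermetic /tmp: each sandbox's /tmp is backed by its own directory (made-up paths).
    List<Path> hermetic =
        List.of(
            pidFile(Path.of("/sandbox/1/_tmp"), pid),
            pidFile(Path.of("/sandbox/2/_tmp"), pid));

    System.out.println("shared /tmp collides: " + hasCollision(shared));
    System.out.println("hermetic /tmp collides: " + hasCollision(hermetic));
  }
}
```

Because PIDs restart in every PID namespace, the PID alone no longer makes the file name unique on the host; giving each sandbox its own backing directory restores uniqueness.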

With the flag enabled, --sandbox_add_mount_pair=/tmp can be used to
restore the old behavior of a non-hermetic /tmp directory.

This is made possible by a small change to linux-sandbox which allows
mount pair source directories to also be marked as writable directories.

Work towards #3236

@fmeum fmeum force-pushed the 3236-hermetic-sandbox branch 4 times, most recently from c870104 to d984b32 Compare September 25, 2022 08:32
@fmeum
Collaborator Author

fmeum commented Sep 25, 2022

@larsrc-google @philwo Would you be able to take a look? This realizes #3236 (comment) as an incompatible flag, since I believe this kind of bug should not occur with default settings. Happy to discuss alternative ways to ship this though.

@fmeum fmeum marked this pull request as ready for review September 25, 2022 08:50
@sgowroji sgowroji added the team-Local-Exec Issues and PRs for the Execution (Local) team label Sep 26, 2022
@gtech-bazel-bot gtech-bazel-bot bot added the awaiting-review PR is awaiting review from an assigned reviewer label Sep 26, 2022
@fmeum fmeum changed the title WIP: Add --incompatible_sandbox_hermetic_tmp Add --incompatible_sandbox_hermetic_tmp Sep 26, 2022
@@ -283,9 +289,15 @@ protected ImmutableSet<Path> getWritableDirs(Path sandboxExecRoot, Map<String, S
}

private SortedMap<Path, Path> getReadOnlyBindMounts(
Contributor

Is it just me, or is this very poorly named?

Collaborator Author

Not just you, it was very poorly named after my changes. Fixed it.

// host filesystem's /tmp. User-specified bind mounts can override this and use the host's
// /tmp instead by mounting /tmp to /tmp, if desired.
bindMounts.put(tmpPath, sandboxTmp);
}
if (blazeDirs.getWorkspace().startsWith(tmpPath)) {
Contributor

Do these two following remounts work with hermetic tmp?

Contributor

Not as far as I can tell. I used the commands from test_write_hermetic_tmp to run it within /tmp and elsewhere, and it works elsewhere but fails within /tmp:

$ /tmp/bazel-hermeticsandbox --bazelrc=/dev/null  test pkg:tmp_test --spawn_strategy=sandboxed --incompatible_sandbox_hermetic_tmp --sandbox_debug --verbose_failures
INFO: Analyzed target //pkg:tmp_test (1 packages loaded, 2 targets configured).
INFO: Found 1 test target...
ERROR: /tmp/hermetictmptest/pkg/BUILD:1:8: Testing //pkg:tmp_test failed: (Exit 1): linux-sandbox failed: error executing command 
  (cd /usr/local/google/home/larsrc/.cache/bazel/_bazel_larsrc/5aee95154b6da9eb556a8fb39d46f41c/sandbox/linux-sandbox/8/execroot/__main__ && \
  exec env - \
    EXPERIMENTAL_SPLIT_XML_GENERATION=1 \
    JAVA_RUNFILES=bazel-out/k8-fastbuild/bin/pkg/tmp_test.runfiles \
    PATH=/usr/local/google/home/larsrc/.local/bin:/usr/local/google/home/larsrc/bin:/usr/lib/google-golang/bin:/usr/local/buildtools/java/jdk/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/lib/intellij-idea/bin:/usr/local/google/home/larsrc/bin \
    PYTHON_RUNFILES=bazel-out/k8-fastbuild/bin/pkg/tmp_test.runfiles \
    RUNFILES_DIR=bazel-out/k8-fastbuild/bin/pkg/tmp_test.runfiles \
    RUN_UNDER_RUNFILES=1 \
    TEST_BINARY=pkg/tmp_test \
    TEST_INFRASTRUCTURE_FAILURE_FILE=bazel-out/k8-fastbuild/testlogs/pkg/tmp_test/test.infrastructure_failure \
    TEST_LOGSPLITTER_OUTPUT_FILE=bazel-out/k8-fastbuild/testlogs/pkg/tmp_test/test.raw_splitlogs/test.splitlogs \
    TEST_NAME=//pkg:tmp_test \
    TEST_PREMATURE_EXIT_FILE=bazel-out/k8-fastbuild/testlogs/pkg/tmp_test/test.exited_prematurely \
    TEST_SHARD_INDEX=0 \
    TEST_SIZE=medium \
    TEST_SRCDIR=bazel-out/k8-fastbuild/bin/pkg/tmp_test.runfiles \
    TEST_TARGET=//pkg:tmp_test \
    TEST_TIMEOUT=300 \
    TEST_TMPDIR=_tmp/dce7d97fa01edd02eaaa2ea9bd7d11f5 \
    TEST_TOTAL_SHARDS=0 \
    TEST_UNDECLARED_OUTPUTS_ANNOTATIONS=bazel-out/k8-fastbuild/testlogs/pkg/tmp_test/test.outputs_manifest/ANNOTATIONS \
    TEST_UNDECLARED_OUTPUTS_ANNOTATIONS_DIR=bazel-out/k8-fastbuild/testlogs/pkg/tmp_test/test.outputs_manifest \
    TEST_UNDECLARED_OUTPUTS_DIR=bazel-out/k8-fastbuild/testlogs/pkg/tmp_test/test.outputs \
    TEST_UNDECLARED_OUTPUTS_MANIFEST=bazel-out/k8-fastbuild/testlogs/pkg/tmp_test/test.outputs_manifest/MANIFEST \
    TEST_UNDECLARED_OUTPUTS_ZIP=bazel-out/k8-fastbuild/testlogs/pkg/tmp_test/test.outputs/outputs.zip \
    TEST_UNUSED_RUNFILES_LOG_FILE=bazel-out/k8-fastbuild/testlogs/pkg/tmp_test/test.unused_runfiles_log \
    TEST_WARNINGS_OUTPUT_FILE=bazel-out/k8-fastbuild/testlogs/pkg/tmp_test/test.warnings \
    TEST_WORKSPACE=__main__ \
    TMPDIR=/tmp \
    TZ=UTC \
    XML_OUTPUT_FILE=bazel-out/k8-fastbuild/testlogs/pkg/tmp_test/test.xml \
  /usr/local/google/home/larsrc/.cache/bazel/_bazel_larsrc/install/cd380c01deb6ae3b758a4940bc515c99/linux-sandbox -t 15 -w /usr/local/google/home/larsrc/.cache/bazel/_bazel_larsrc/5aee95154b6da9eb556a8fb39d46f41c/sandbox/linux-sandbox/8/execroot/__main__ -w /usr/local/google/home/larsrc/.cache/bazel/_bazel_larsrc/5aee95154b6da9eb556a8fb39d46f41c/sandbox/linux-sandbox/8/execroot/__main__/_tmp/dce7d97fa01edd02eaaa2ea9bd7d11f5 -w /tmp -w /dev/shm -e /tmp -M /usr/local/google/home/larsrc/.cache/bazel/_bazel_larsrc/5aee95154b6da9eb556a8fb39d46f41c/sandbox/linux-sandbox/8/_tmp -m /tmp -M /tmp/hermetictmptest -S /usr/local/google/home/larsrc/.cache/bazel/_bazel_larsrc/5aee95154b6da9eb556a8fb39d46f41c/sandbox/linux-sandbox/8/stats.out -D -- external/bazel_tools/tools/test/generate-xml.sh bazel-out/k8-fastbuild/testlogs/pkg/tmp_test/test.log bazel-out/k8-fastbuild/testlogs/pkg/tmp_test/test.xml 0 1)
Target //pkg:tmp_test up-to-date:
  bazel-bin/pkg/tmp_test
INFO: Elapsed time: 0.259s, Critical Path: 0.02s
INFO: 3 processes: 3 internal.
FAILED: Build did NOT complete successfully
//pkg:tmp_test                                                        NO STATUS

FAILED: Build did NOT complete successfully

Collaborator Author

Good catch, that doesn't seem to work. I think that getting this to work would be possible but slightly involved: It requires mounting the workspace/output base somewhere else first, then mounting /tmp, then mounting that "somewhere else" into the expected paths. That "somewhere else" may require cleanup though.

Do you think that this would be worth the effort? Alternatively, I could disable hermetic /tmp in this case and emit a warning once.
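For reference, the three-step ordering described in this comment could be sketched as follows. This is a hypothetical illustration only: `MountOrderSketch`, `plan`, and the `stagingDir` parameter are not part of Bazel or linux-sandbox, and the steps are returned as plain strings rather than performed.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: lists bind mounts in the order the comment above describes. */
final class MountOrderSketch {

  /**
   * Returns human-readable bind-mount steps for a workspace located under /tmp.
   *
   * @param workspace path under /tmp that must stay visible, e.g. /tmp/hermetictmptest
   * @param sandboxTmp per-sandbox directory that should become /tmp
   * @param stagingDir hypothetical "somewhere else" outside /tmp
   */
  static List<String> plan(String workspace, String sandboxTmp, String stagingDir) {
    List<String> steps = new ArrayList<>();
    // 1. Keep the workspace reachable before /tmp is covered.
    steps.add("bind " + workspace + " -> " + stagingDir);
    // 2. Cover the host /tmp with the per-sandbox directory.
    steps.add("bind " + sandboxTmp + " -> /tmp");
    // 3. Re-expose the workspace at its original path inside the new /tmp.
    steps.add("bind " + stagingDir + " -> " + workspace);
    return steps;
  }

  public static void main(String[] args) {
    plan("/tmp/hermetictmptest", "/sandbox/8/_tmp", "/sandbox/8/_staging")
        .forEach(System.out::println);
  }
}
```

Since a bind mount needs an existing target, step 3 would presumably also require creating the workspace directory inside the new `/tmp` first, and the staging directory is the part that may need cleanup afterwards.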

Contributor

At least for now, let's disallow hermetic /tmp when inside /tmp, whether by disabling it with a warning or maybe downright failing. I would expect it to be rare.

Collaborator Author

I added a warning and am now also taking --sandbox_tmpfs_path=/tmp into account.

@fmeum fmeum force-pushed the 3236-hermetic-sandbox branch 4 times, most recently from 21c5265 to 5094411 Compare September 26, 2022 20:05
@@ -178,6 +184,12 @@ protected SandboxedSpawn prepareSpawn(Spawn spawn, SpawnExecutionContext context
sandboxExecRoot.getParentDirectory().createDirectory();
sandboxExecRoot.createDirectory();

Path sandboxTmp = null;
if (getSandboxOptions().sandboxHermeticTmp && !getSandboxOptions().sandboxTmpfsPath.contains(PathFragment.create("/tmp"))) {
Contributor

Should this check for starting with /tmp (or rather either being /tmp or starting with /tmp/) instead of containing /tmp? Not sure what would happen if --sandbox_tmpfs_path is /tmp/foobar.

Collaborator Author

Added a check for this as well as a descriptive warning. I think we should be able to support it by sorting the tmpfs paths into the bind mounts, but would prefer to ignore this edge case for now.
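The distinction raised in this thread ("being /tmp or starting with /tmp/" rather than "containing /tmp") can be sketched with `java.nio.file.Path`, whose `startsWith(Path)` compares whole path components. This is a standalone illustration under that assumption; the PR itself works with Bazel's `PathFragment`, and the class and method names below are hypothetical.

```java
import java.nio.file.Path;

/** Illustrative only: component-wise checks for paths at or below /tmp. */
final class TmpPrefixCheckSketch {
  private static final Path TMP = Path.of("/tmp");

  /** True if the path is /tmp itself or lies below it. */
  static boolean isAtOrUnderTmp(Path path) {
    return path.normalize().startsWith(TMP);
  }

  /** True only for paths strictly below /tmp, such as /tmp/foobar. */
  static boolean isStrictlyUnderTmp(Path path) {
    Path normalized = path.normalize();
    return normalized.startsWith(TMP) && !normalized.equals(TMP);
  }

  public static void main(String[] args) {
    for (String s : new String[] {"/tmp", "/tmp/foobar", "/tmpfoo", "/var/tmp"}) {
      Path p = Path.of(s);
      System.out.println(
          s + ": atOrUnder=" + isAtOrUnderTmp(p) + ", strictlyUnder=" + isStrictlyUnderTmp(p));
    }
  }
}
```

A plain string prefix test would wrongly treat `/tmpfoo` as being under `/tmp`, and a set-membership test for `/tmp` alone would miss `/tmp/foobar`; the component-wise comparison avoids both.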

.findFirst();
// Mounting a tmpfs strictly below the hermetic /tmp isn't supported. Mounting a tmpfs on /tmp
// makes mounting the disk-based hermetic /tmp unnecessary.
if (tmpfsPathUnderTmp.isPresent() && !getSandboxOptions().sandboxTmpfsPath.contains(tmpRoot)
Contributor

The ifs for the warning and the actual logic should not be separate. It would be better to have one if with the logic and an inner if with just the warnedAboutNonHermeticTmp check.

Collaborator Author

I rearranged the control flow and also moved the comments around. PTAL.
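The shape requested in this thread (one outer `if` carrying the logic, with an inner `if` that only guards the one-time warning) might look roughly like the sketch below. Only the name `warnedAboutNonHermeticTmp` comes from the review comment; the `AtomicBoolean` type, the method, its parameters, and the warning text are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

/** Illustrative only: decision logic in one place, warning emitted at most once. */
final class WarnOnceSketch {
  private final AtomicBoolean warnedAboutNonHermeticTmp = new AtomicBoolean(false);

  /** Collected warnings; stands in for whatever event reporter the real code uses. */
  final List<String> warnings = new ArrayList<>();

  /**
   * Decides whether a hermetic /tmp is used for a spawn.
   *
   * @param hermeticTmpRequested hypothetical: the flag is enabled
   * @param unsupportedSetup hypothetical: e.g. the workspace or a tmpfs path is under /tmp
   */
  boolean useHermeticTmp(boolean hermeticTmpRequested, boolean unsupportedSetup) {
    if (hermeticTmpRequested && unsupportedSetup) {
      // The inner check guards only the warning, never the decision itself.
      if (warnedAboutNonHermeticTmp.compareAndSet(false, true)) {
        warnings.add("Falling back to non-hermetic /tmp");
      }
      return false;
    }
    return hermeticTmpRequested;
  }

  public static void main(String[] args) {
    WarnOnceSketch sketch = new WarnOnceSketch();
    sketch.useHermeticTmp(true, true);
    sketch.useHermeticTmp(true, true);
    System.out.println("warnings emitted: " + sketch.warnings.size());
  }
}
```

Keeping the fallback decision outside the warned-once check means later spawns still fall back correctly even though the warning is no longer repeated.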

@fmeum fmeum force-pushed the 3236-hermetic-sandbox branch 2 times, most recently from 4f3852b to 65105dd Compare October 4, 2022 11:41
@larsrc-google
Contributor

Looks nice now. With this much new logic, could you add some unit tests in LinuxSandboxedSpawnRunnerTest that the various options lead to the appropriate directories and command line arguments?

@fmeum
Collaborator Author

fmeum commented Oct 7, 2022

> Looks nice now. With this much new logic, could you add some unit tests in LinuxSandboxedSpawnRunnerTest that the various options lead to the appropriate directories and command line arguments?

I added unit tests for three scenarios. I wasn't able to figure out how to:

  1. test for the warnings (events.assertContainsWarning never succeeds even if the warning is generated)
  2. test with output base or workspace under /tmp.

@larsrc-google In case you want me to add coverage for any of these cases, I would need some help.

@fmeum fmeum requested review from larsrc-google and removed request for benjaminp October 13, 2022 08:00
EdSchouten pushed a commit to EdSchouten/bazel that referenced this pull request Oct 17, 2022
With the new flag, each Linux sandbox will have its own dedicated empty
directory mounted as `/tmp` rather than sharing `/tmp` with the host
filesystem.

This is necessary since the Linux sandbox uses a PID namespace, but many
tools (e.g. the JVM) create files at well-known locations such as
`/tmp/.javapid${PID}`, which leads to collisions between different
sandboxes and the host. These collisions can result in crashes or
surprising situations such as Java agents being attached to a JVM in a
different sandbox.

With the flag enabled, `--sandbox_add_mount_pair=/tmp` can be used to
restore the old behavior of a non-hermetic `/tmp` directory.

This is made possible by a small change to linux-sandbox which allows
mount pair source directories to also be marked as writable directories.

Work towards bazelbuild#3236

Closes bazelbuild#16336.

PiperOrigin-RevId: 481570131
Change-Id: I01b654a1f4b0223a8f272cf644e23a7d0572ea09
@manuelnaranjo

@fmeum you may be interested in my recent findings. I was trying this feature for some yarn stuff that otherwise fails to build and is hard to migrate to rules_js, so we call yarn directly, just inside a sandbox. When I enabled this flag, I started getting a weird error about files not existing:

1694676026.955057621: src/main/tools/linux-sandbox.cc:197: done manipulating pipes
1694676026.955168864: src/main/tools/linux-sandbox-pid1.cc:285: working dir: /home/mnaranjo/.cache/bazel/_bazel_mnaranjo/4e6d4fec8727eb9e92cb2d52e0ffc3c1/sandbox/linux-sandbox/27/execroot/booking_core_main
1694676026.955205993: src/main/tools/linux-sandbox-pid1.cc:312: bind mount: /home/mnaranjo/.cache/bazel/_bazel_mnaranjo/4e6d4fec8727eb9e92cb2d52e0ffc3c1/sandbox/linux-sandbox/27/_tmp -> /tmp
src/main/tools/linux-sandbox-pid1.cc:314: "mount(/home/mnaranjo/.cache/bazel/_bazel_mnaranjo/4e6d4fec8727eb9e92cb2d52e0ffc3c1/sandbox/linux-sandbox/27/_tmp, /tmp, nullptr, MS_BIND, nullptr)": No such file or directory
1694676026.973839738: src/main/tools/linux-sandbox.cc:233: child exited normally with code 1

I managed to create a repro repository: https://github.com/bookingcom/test-sandbox-hermetic-tmp. It turns out to be a weird interaction with --experimental_reuse_sandbox_directories. I checked and it doesn't happen with 7.0.0-pre.20230906.2, so it's fixed upstream. I don't think it's worth backporting the fix to 6.3, but I believe documenting my findings may help others as well.

@fmeum fmeum deleted the 3236-hermetic-sandbox branch September 14, 2023 08:46