Figure out why bazel behaves weirdly (not recognizing changes to image after it is installed) #8915

Closed
jonathanmetzman opened this issue Nov 2, 2022 · 12 comments · Fixed by #9507

@jonathanmetzman
Contributor

For some reason, Bazel doesn't know about changes to the images after it is installed. In bc02fd0, when I stopped installing tons of unneeded packages, python3 stopped getting installed in base-builder. This ended up breaking upb (not sure why trial builds missed this): even though upb installs python3 in its own Dockerfile, Bazel couldn't find it.

@jonathanmetzman
Contributor Author

Assigning to the current sheriff.

@jonathanmetzman
Contributor Author

Undo #8914 when this is fixed please.
A "fix" is installing python3 in base-builder but this isn't acceptable. Bazel should be able to use dependencies that are installed in project images.

@nareddyt
Contributor

nareddyt commented Nov 3, 2022

Same error in #8909

@zhangskz
Contributor

zhangskz commented Nov 4, 2022

I think this is affecting protobuf-python project as well. Adding same temporary fix as upb to protobuf-python/Dockerfile in #8930, but actually it looks like the sha256 isn't valid for gcr.io/oss-fuzz-base/base-builder-python. @jonathanmetzman can you assist here?

jonathanmetzman added a commit that referenced this issue Nov 7, 2022
Same temporary fix as #8914

Seems to be affected by #8915
per failures in
https://github.com/google/oss-fuzz/actions/runs/3396815368/jobs/5648300142

Co-authored-by: jonathanmetzman <31354670+jonathanmetzman@users.noreply.github.com>
@jonathanmetzman
Contributor Author

Have you had a chance to look into this at all Navid?

@Navidem
Contributor

Navidem commented Nov 22, 2022

> Have you had a chance to look into this at all Navid?

Unfortunately I did not find time to investigate this.

@jonathanmetzman
Contributor Author

CC @stefanbucur who worked on the bazel integration and may have ideas.
Also: maybe GOSST's bazel expert @mihaimaruseac knows about this.

@mihaimaruseac
Member

Hmm, this is weird. I'll try to take a look tomorrow (after GUAC workshop ends today)

@stefanbucur
Contributor

Sorry for the latency, will also take a look today or tomorrow unless Mihai beats me to it.

@mihaimaruseac
Member

mihaimaruseac commented Jan 25, 2023

So I think the issue is partially due to bazelbuild/bazel#8685. Testing with protobuf, the error is from a bazel action trying to execute bazel-out/k8-opt-exec-2B5CBBC6/bin/external/rules_fuzzing/fuzzing/tools/make_corpus_dir which in both cases (with or without #8914) starts with

#!/usr/bin/env python3

Paradoxically, opening a shell with python3 infra/helper.py shell upb and then running /usr/bin/env python3 inside it succeeds in either case.
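As a rough illustration of why the shebang matters here, the sketch below splits a script's first line into the interpreter and its argument. On Linux the kernel runs the interpreter named after `#!` and passes the rest of the line as a single argument, so `#!/usr/bin/env python3` leaves it to `env` to find `python3` through the search path. The helper names `parse_shebang` and `needs_path_lookup` are made up for this example and are not part of Bazel or rules_fuzzing.

```python
import os


def parse_shebang(first_line):
    """Split a shebang line into (interpreter, argument).

    Returns None if the line is not a shebang. The argument is None when
    nothing follows the interpreter path.
    """
    if not first_line.startswith("#!"):
        return None
    parts = first_line[2:].strip().split(maxsplit=1)
    if not parts:
        return None
    interpreter = parts[0]
    argument = parts[1] if len(parts) > 1 else None
    return interpreter, argument


def needs_path_lookup(first_line):
    """True when the script delegates interpreter discovery to `env`."""
    parsed = parse_shebang(first_line)
    if parsed is None:
        return False
    interpreter, argument = parsed
    return os.path.basename(interpreter) == "env" and argument is not None


print(parse_shebang("#!/usr/bin/env python3"))
print(needs_path_lookup("#!/usr/bin/env python3"))
print(needs_path_lookup("#!/usr/bin/python3"))
```

A script with an absolute interpreter path would not depend on the search path at all, which is why the `env` form is the one exposed to the problem discussed in this thread.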

Looking deeper at the failing rule:

  (cd /root/.cache/bazel/_bazel_root/849200b88dc1111ed7bd040bc23f9dd8/execroot/upb && \
  exec env - \
  bazel-out/k8-opt-exec-2B5CBBC6/bin/external/rules_fuzzing/fuzzing/tools/make_corpus_dir '--output_dir=bazel-out/k8-opt/bin/upb/fuzz/file_descriptor_parsenew_fuzzer_corpus' '--corpus_list_file=bazel-out/k8-opt/bin/upb/fuzz/file_descriptor_parsenew_fuzzer_corpus-0.params')

This is probably because exec env - starts the command with an empty environment:

root@e77a2661452a:/src/upb# exec env - /usr/bin/env python3 
/usr/bin/env: 'python3': No such file or directory

But why? I don't know yet.
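The empty-environment effect can be reproduced outside Bazel. The sketch below is only an approximation: it uses Python's `subprocess` with `env={}` as a stand-in for `env -`, assumes a POSIX system, and the helper name `child_path` is invented for this example.

```python
import subprocess
import sys


def child_path(env):
    """Return the PATH value seen by a child Python started with `env`."""
    probe = "import os; print(os.environ.get('PATH'))"
    result = subprocess.run(
        [sys.executable, "-c", probe],
        env=env,               # the child receives exactly this mapping
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# An empty mapping mimics `env -`: nothing is inherited, not even PATH.
print(child_path({}))
# With PATH passed explicitly, the child sees that value.
print(child_path({"PATH": "/usr/local/bin:/usr/bin:/bin"}))
```

The child is launched through an absolute path (`sys.executable`), so it starts even though it cannot look anything up by name.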

The specific build command that crashes is

bazel build -c opt --@rules_fuzzing//fuzzing:cc_engine=@rules_fuzzing_oss_fuzz//:oss_fuzz_engine --@rules_fuzzing//fuzzing:cc_engine_instrumentation=oss-fuzz --@rules_fuzzing//fuzzing:cc_engine_sanitizer=none --cxxopt=-stdlib=libc++ --linkopt=-lc++ --verbose_failures --spawn_strategy=standalone --action_env=CC=clang --action_env=CXX=clang++ -s //upb/fuzz:file_descriptor_parsenew_fuzzer_corpus

@mihaimaruseac
Member

I think I found the issue:

On the good build (with #8914):

root@409cb65786b1:/src/upb# type -a python3
python3 is /usr/local/bin/python3
python3 is /usr/bin/python3
python3 is /bin/python3

On the broken one (with #8914 removed):

root@5ad1804970b8:/src/upb# type -a python3
python3 is /usr/local/bin/python3

Bazel executes commands in an empty environment using env -. This entails that $PATH is also different, set to some default.

Turns out, in the working environment /bin/python3 is the selected one, as can be seen from an strace:

execve("/bin/python3", ["python3"], 0x7ffe9e7f2d60 /* 0 vars */) = 0

In the non-working environment only /bin and /usr/bin are searched, and there is no python3 in either:

execve("/bin/python3", ["python3"], 0x7ffe12c12b30 /* 0 vars */) = -1 ENOENT (No such file or directory)
execve("/usr/bin/python3", ["python3"], 0x7ffe12c12b30 /* 0 vars */) = -1 ENOENT (No such file or directory)
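The lookup behaviour above can be modelled with a small sketch. It builds a throwaway directory tree standing in for the image, places a fake `python3` only under `usr/local/bin`, and resolves the command against two search orders. The fallback order (`bin`, then `usr/bin`) is taken from the strace output above; the "interactive" order, the helper names, and the temporary layout are assumptions made for illustration.

```python
import os
import shutil
import stat
import tempfile


def make_fake_binary(directory, name):
    """Create a small executable file so that shutil.which can find it."""
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, name)
    with open(path, "w") as f:
        f.write("#!/bin/sh\n")
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
    return path


def resolve(command, search_dirs):
    """Return the first match for `command` in `search_dirs`, or None."""
    return shutil.which(command, path=os.pathsep.join(search_dirs))


root = tempfile.mkdtemp()
usr_local_bin = os.path.join(root, "usr", "local", "bin")
usr_bin = os.path.join(root, "usr", "bin")
bin_dir = os.path.join(root, "bin")
for d in (usr_local_bin, usr_bin, bin_dir):
    os.makedirs(d, exist_ok=True)

# Broken image: python3 exists only under usr/local/bin.
make_fake_binary(usr_local_bin, "python3")

interactive = [usr_local_bin, usr_bin, bin_dir]  # assumed shell search order
fallback = [bin_dir, usr_bin]                    # order seen in the strace

print(resolve("python3", interactive))  # resolves under usr/local/bin
print(resolve("python3", fallback))     # no match in bin or usr/bin
```

Adding a `python3` to either fallback directory makes the second lookup succeed, which matches the working image where /bin/python3 was present.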

The two Pythons are also different versions:

root@409cb65786b1:/src/upb# /bin/python3 --version
Python 3.8.10
root@409cb65786b1:/src/upb# /usr/local/bin/python3 --version
Python 3.8.3

@mihaimaruseac
Member

I have a fix in #9507

jonathanmetzman pushed a commit that referenced this issue Feb 6, 2023
The issue in #8915 is that the environment no longer has a leftover
`python3` binary in `/bin/python3`. This uncovers a bug in the `upb` and
`jwt-verify-lib` Dockerfiles where `python2` was installed (or no Python
was installed).

The issue seems to show up on Bazel projects only due to the way Bazel
executes commands: it uses `env -` to run them in a clear environment,
meaning that even `$PATH` is altered. Before bc02fd0 the issues in the
Dockerfiles were hidden by the fact that the environment contained
multiple versions of Python and one happened to be matched by this
search path.

This fixes #8915, reverting #8914 and #8909 tweaks to #8915. I did not
do a similar thing for #8930 as maybe that can be fixed by changing the
base python image?

Tested: Tested that I can build the `upb` fuzzers with this change.

Signed-off-by: Mihai Maruseac <mihaimaruseac@google.com>

eamonnmcmanus pushed a commit to eamonnmcmanus/oss-fuzz that referenced this issue Mar 15, 2023