bazel 0.27 appears to no longer run on Trusty/RHEL/CentOS due to glibc compat #8652

Closed
mattklein123 opened this issue Jun 17, 2019 · 43 comments
Assignees
Labels
breakage P0 This is an emergency and more important than other current work. (Assignee required)

Comments

@mattklein123

/usr/bin/bazel: relocation error: /usr/bin/bazel: symbol _ZTVNSt7__cxx1115basic_stringbufIcSt11char_traitsIcESaIcEEE, version GLIBCXX_3.4.21 not defined in file libstdc++.so.6 with link time reference
make: *** [bazel_clean] Error 127

Is this expected or should Bazel still support these platforms?

@mattklein123
Author

cc @moderation

@jin
Member

jin commented Jun 17, 2019

cc @philwo @fweikert @laurentlb

@jin jin added the breakage label Jun 17, 2019
@jin
Member

jin commented Jun 17, 2019

See #7816 (comment)

@mattklein123
Author

Hmm, yikes. Dropping support for Trusty before EOL seems not great but maybe understandable, but I'm guessing that also dropping support for RHEL/CentOS is going to be very problematic for lots of people. Is there any chance of reconsidering this?

@moderation

moderation commented Jun 17, 2019

Error on RHEL 7 is:

ldd --version
ldd (GNU libc) 2.17
./bazel-0.27.0-linux-x86_64 
./bazel-0.27.0-linux-x86_64: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by ./bazel-0.27.0-linux-x86_64)
./bazel-0.27.0-linux-x86_64: /usr/lib64/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by ./bazel-0.27.0-linux-x86_64)
./bazel-0.27.0-linux-x86_64: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by ./bazel-0.27.0-linux-x86_64)
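
The missing version tags in output like the above can be collected mechanically. A minimal sketch, assuming loader messages in exactly the format shown; the helper name `missing_versions` is hypothetical and not part of any tool mentioned in this thread:

```python
import re
from typing import List

# Loader messages look like:
#   <binary>: <lib>: version `GLIBCXX_3.4.21' not found (required by <binary>)
# The pattern assumes that exact backtick/apostrophe quoting.
_MISSING = re.compile(r"version `([A-Za-z0-9_.]+)' not found")


def missing_versions(loader_output: str) -> List[str]:
    """Return the sorted, de-duplicated version tags reported as missing."""
    return sorted(set(_MISSING.findall(loader_output)))


sample = """\
./bazel-0.27.0-linux-x86_64: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by ./bazel-0.27.0-linux-x86_64)
./bazel-0.27.0-linux-x86_64: /usr/lib64/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by ./bazel-0.27.0-linux-x86_64)
./bazel-0.27.0-linux-x86_64: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by ./bazel-0.27.0-linux-x86_64)
"""
print(missing_versions(sample))
```

Comparing that list against the version tags the system `libstdc++.so.6` actually provides would show how far behind the host library is.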

@laurentlb laurentlb added the P0 This is an emergency and more important than other current work. (Assignee required) label Jun 17, 2019
@laurentlb
Contributor

There was a discussion here: https://groups.google.com/d/msg/bazel-dev/_D6XzfNkQQE/8TNKiNmsCAAJ

It looks like the change is breaking more things than intended. I'll wait for Philipp's feedback, but I suspect we'll do a new release to fix this.

@mzeren-vmw
Contributor

FWIW our internal build of bazel links libstdc++ statically to resolve this type of issue.

@mattklein123
Author

> FWIW our internal build of bazel links libstdc++ statically to resolve this type of issue.

+1 it would be awesome if the bazel official builds could do this also.

@philwo
Member

philwo commented Jun 17, 2019

tl;dr I'll fix this and will add official support for CentOS (we'll have to see which versions of CentOS we can do, but 7.x and 8.x sound very likely - does anyone still need CentOS 6.x?).

@mzeren-vmw, could you share your patches (if any) / build command-line for Bazel that you use to statically link libstdc++? Maybe we can just use the same for our official releases.

> It looks like the change is breaking more things than intended.

The change dropped official support for Ubuntu 14.04 LTS and instead made Ubuntu 16.04 LTS our main build platform to build Linux release artifacts on. It was unknown at the time which platforms that would break. The reasoning was that Ubuntu 14.04 LTS standard support ended, which makes it "de facto" end of life for the majority of users, because package repositories with security updates can now only be accessed via a paid subscription, which isn't an option for our CI.

Users were asked for their opinion on this, and we built several release candidates of Bazel 0.27.0 to learn about the larger impact of this change. I'm not aware of anyone complaining until after the final release.

> Dropping support for Trusty before EOL seems not great but maybe understandable, [...]

Yes. We have to strike a balance here and going with the standard support period of the most popular operating systems seems like a good option for now.

> but I'm guessing that also dropping support for RHEL/CentOS is going to be very problematic for lots of people. Is there any chance of reconsidering this?

This is unfortunate and I'll improve the current situation in two steps:

  • Build Linux releases in a way that lets them work on supported RHEL / CentOS versions.
  • Make RHEL / CentOS a fully supported OS that we test Bazel on.

It would have been nice if people had reported breakages earlier, while 0.27.0 was still in the release candidate phase (or even earlier, after the initial announcement that we'll build Bazel on Ubuntu 16.04 LTS in the future, though I understand that it's hard to extrapolate "it will stop working on CentOS 7.x" from that). For now, let's see whether we can get this to work in 0.28.0 and hopefully in a 0.27.1 patch release.

@moderation

@philwo I'd be happy to test release candidate builds on RHEL. Where is the best place to track rc releases? They are not published on https://github.com/bazelbuild/bazel/releases and I don't see announcements on https://groups.google.com/forum/#!forum/bazel-dev

@philwo
Member

philwo commented Jun 17, 2019

@moderation Thank you! I'll see that we can upload release candidates as pre-releases to GitHub.

They are currently announced on the bazel-discuss group, e.g. see here: https://groups.google.com/forum/#!topic/bazel-discuss/9BFvKuEsEbg.

Bazelisk also supports a "last_rc" version which will automatically use the newest release candidate if one is available, otherwise the latest stable release. I'm using this on my personal and work machine and will ask the Bazel team to use it, so that the release candidates get some more real world testing.

@mattklein123
Author

@philwo FWIW, I think the static link fixes would also fix Trusty, at least in the short term, with the understanding it is not fully supported. We would really, really appreciate it if you could do this there also.

@gennadiycivil

This is affecting googletest builds as well.
Succeeding: bazel (0.26.1)
Failing: bazel (0.27.0)
https://travis-ci.org/google/googletest/builds/546825101
https://travis-ci.org/google/googletest/jobs/546825103

@philwo
Member

philwo commented Jun 17, 2019

@mattklein123 I'm happy to do the static linking as soon as someone tells me how. I can also do a quick manual test that the resulting binaries work on Trusty and CentOS. :)

@philwo
Member

philwo commented Jun 17, 2019

> This is affecting googletest builds as well.
> Succeeding: bazel (0.26.1)
> Failing: bazel (0.27.0)

@gennadiycivil Travis CI uses Ubuntu 14.04 by default, but also supports Ubuntu 16.04. You (or the googletest owners) will have to update their Travis CI config to use Ubuntu 16.04.

Does that work for you?

@mattklein123
Author

Ultimately it comes down to passing ["-static-libstdc++", "-static-libgcc"] to compilation. I'm not familiar with the bazel build though. Hopefully @mzeren-vmw can provide a diff.

@gennadiycivil

> > This is affecting googletest builds as well.
> > Succeeding: bazel (0.26.1)
> > Failing: bazel (0.27.0)
>
> @gennadiycivil Travis CI uses Ubuntu 14.04 by default, but also supports Ubuntu 16.04. You (or the googletest owners) will have to update their Travis CI config to use Ubuntu 16.04.
>
> Does that work for you?

Yes I will update relevant VMs to xenial.

@mzeren-vmw
Contributor

mzeren-vmw commented Jun 17, 2019

Unfortunately, we do this by substituting a wrapper for CC and CXX which jams -l:libstdc++.a -lm on the command line. Not an upstream-able thing. :(.
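
The wrapper approach described here can be sketched roughly as follows. This is a hypothetical illustration, not the actual internal wrapper or Envoy's script: the function name `rewrite_link_args`, the rule of leaving non-linking invocations (`-c`, `-E`, `-S`) untouched, and the removal of `-lstdc++` are all assumptions.

```python
from typing import List


def rewrite_link_args(args: List[str]) -> List[str]:
    """Return compiler arguments adjusted to link libstdc++ statically.

    Assumption: invocations with -c, -E or -S do not link, so they are
    returned unchanged.
    """
    if any(flag in args for flag in ("-c", "-E", "-S")):
        return list(args)
    # Assumption: drop the dynamic request so the archive below is used.
    kept = [a for a in args if a != "-lstdc++"]
    # -l:libstdc++.a asks the GNU linker for that exact file name;
    # -lm follows because libstdc++ depends on libm.
    return kept + ["-l:libstdc++.a", "-lm"]


print(rewrite_link_args(["-o", "bazel", "main.o", "-lstdc++"]))
```

A real wrapper would apply this to `sys.argv[1:]` and then hand off to the real compiler, for example with `os.execvp`.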

@vbatts

vbatts commented Jun 17, 2019

i would love for someone on the core team to own the fedora and RHEL builds 😸
Currently I build them: https://copr.fedorainfracloud.org/coprs/vbatts/bazel/

@lizan

lizan commented Jun 17, 2019

@philwo @mattklein123 see the wrapper script in Envoy:
https://github.com/envoyproxy/envoy/blob/master/bazel/cc_wrapper.py
Basically, -static-libstdc++ works with g++ but not with gcc -x c++, and Bazel invokes the latter.

@philwo
Member

philwo commented Jun 17, 2019

Thank you @mzeren-vmw and @lizan!

@hlopko Could you help me set up static linking of libstdc++ for Bazel’s own build using the current best practices? What’s your recommendation for this?

@philwo
Member

philwo commented Jun 18, 2019

It seems like CentOS 7 has a slightly older glibc (2.17 vs. 2.19) and its libstdc++ is only newer by a patch-level bump (4.8.5 vs. 4.8.4) compared to Ubuntu 14.04. If we build our release artifacts on CentOS 7, the resulting binaries might work on all platforms that we intend to support.

Statically linking libstdc++ would increase the binary size by a few megabytes, which we'd like to avoid. I'm also wondering what would happen to our JNI library libunix.so: wouldn't that also need to (separately) statically link against libstdc++?

@hlopko
Member

hlopko commented Jun 18, 2019

Re the size increase, it's 47_434_606 bytes with statically linked libstdc++, and 45_176_063 with it being linked dynamically.
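
For scale, the difference between those two figures works out to about 2.15 MiB, roughly 5% of the dynamically linked binary. A quick check of the arithmetic (the variable names are just for illustration):

```python
static_size = 47_434_606   # bytes, libstdc++ linked statically
dynamic_size = 45_176_063  # bytes, libstdc++ linked dynamically

delta = static_size - dynamic_size
print(f"{delta} bytes, {delta / 2**20:.2f} MiB, "
      f"{100 * delta / dynamic_size:.1f}% larger")
```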

@hlopko
Member

hlopko commented Jun 18, 2019

So, I'm not a big fan of linking against libstdc++ statically. What if we created additional released binaries for old libstdc++ in addition to our current binaries? I'm hopeful that this will work (didn't try):

bazel build //src:bazel -c opt --copt=-D_GLIBCXX_USE_CXX11_ABI=0 

Some background to why there is this libstdc++ incompatibility: https://gcc.gnu.org/onlinedocs/libstdc++/manual/using_dual_abi.html.
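
As the linked GCC page explains, types affected by the new ABI live in the inline namespace `std::__cxx11`, and that namespace shows up in mangled symbol names such as the one in the original error. A minimal sketch for spotting the marker; the helper name `references_cxx11_abi` is hypothetical, and this is a plain substring test rather than real demangling:

```python
def references_cxx11_abi(mangled: str) -> bool:
    """True if a mangled symbol names something in std::__cxx11."""
    # In Itanium mangling a namespace component is length-prefixed,
    # so `__cxx11` (7 characters) appears as `7__cxx11`.
    return "7__cxx11" in mangled


# Symbol taken from the relocation error at the top of this issue.
symbol = "_ZTVNSt7__cxx1115basic_stringbufIcSt11char_traitsIcESaIcEEE"
print(references_cxx11_abi(symbol))
```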

@lizan

lizan commented Jun 18, 2019

@hlopko You don’t even need that define if you build on CentOS/RHEL 7; the toolchain there forces the CXX11 ABI off.

@emidln

emidln commented Jun 18, 2019

Having multiple binaries would be unfortunate for users of tools like bazelbuild/bazelisk who happen to be on Centos7. I'm in this camp, and while I've built a forked bazel/bazelisk internally for my company in the past, I'd kinda like to avoid this in the future. Further, detecting the issue from bazelisk seems awkward too, although I suppose it could technically investigate the system libstdc++ to see if it has the correct symbols. Alternately, it could rely on the user setting an environment variable or a flag to pull in an alternately-built binary.

bazel-io pushed a commit that referenced this issue Jun 19, 2019
Using this variable it will be possible to specify system libraries that
should be added after default linker flags, after libraries to link, and
after user link flags (i.e. after --linkopt and `linkopts` rule attribute).

Flag separator is `:`, similarly to `BAZEL_LINKOPTS`. Escaping is done by
`%`.

Default value is empty string.

With this it's possible to force Bazel to statically link `libstdc++` by
using:

```
BAZEL_LINKOPTS=-static-libstdc++ BAZEL_LINKLIBS=-l%:libstdc++.a bazel build //foo
```

Fixes #2840.
Relevant for #8652.

RELNOTES: Bazel's C++ autoconfiguration now understands `BAZEL_LINKLIBS` environment variable to specify system libraries that should be appended to the link command line.

Closes #8660.

PiperOrigin-RevId: 253946433
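
The separator and escaping rules described in this commit message can be illustrated with a small splitter. This is a sketch of the described behavior, not Bazel's actual implementation: the function name `split_escaped` is hypothetical, and treating `%%` as a literal `%` and dropping empty entries are assumptions beyond what the commit message states.

```python
from typing import List


def split_escaped(value: str, sep: str = ":", esc: str = "%") -> List[str]:
    """Split value on sep, treating esc+sep as a literal separator.

    Assumption: esc+esc yields a literal esc; any other esc is kept as-is.
    """
    parts: List[str] = []
    current = ""
    i = 0
    while i < len(value):
        ch = value[i]
        nxt = value[i + 1] if i + 1 < len(value) else ""
        if ch == esc and nxt in (sep, esc):
            current += nxt  # escaped character is taken literally
            i += 2
        elif ch == sep:
            parts.append(current)
            current = ""
            i += 1
        else:
            current += ch
            i += 1
    parts.append(current)
    # Assumption: empty entries are dropped, so an empty value gives no flags.
    return [p for p in parts if p]


print(split_escaped("-l%:libstdc++.a"))
```

Under these rules, the `BAZEL_LINKLIBS` value from the example above becomes the single linker argument `-l:libstdc++.a`.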
@yongtang

I think this might have an impact on TensorFlow, as TF 1.14 is still built on Ubuntu 14.04. /cc @gunan @perfinion @angersson @martinwicke As I remember, the discussion about TF and manylinux2014 was about moving to CentOS 7.0? I'm wondering whether the 2.0 timeline fits this scenario.

@brianwa84

@csuter this is the tracker for the TFP 0.27 issue

@gunan

gunan commented Jun 21, 2019

For 1.14, we already set the max bazel version to 0.25.2:
https://github.com/tensorflow/tensorflow/blob/r1.14/configure.py#L1386

Even on master, 0.26.1 is the max version:
https://github.com/tensorflow/tensorflow/blob/master/configure.py#L53

So as far as I am concerned, this is a bazel 0.27 upgrade blocker, rather than a bug in TF build system.

@vbatts

vbatts commented Jun 21, 2019

have you tried my centos build? https://copr.fedorainfracloud.org/coprs/vbatts/bazel/
I have 0.27 there now.

@yongtang

@vbatts I tried with CentOS 7 + your bazel build and it works correctly. Thanks! 👍

@vbatts

vbatts commented Jun 21, 2019

@yongtang good to hear

@r4nt
Contributor

r4nt commented Jun 24, 2019

If I remember correctly, last time this happened I saw libc symbol problems, too, in which case statically linking libstdc++ won't help.

The simplest solution for now would probably be to use the manylinux2010 devtoolset gcc to compile bazel.

https://github.com/pypa/manylinux
image is at quay.io/pypa/manylinux2010_x86_64
@philwo - let me know if you need more info, feel free to drop by my desk

@philwo
Member

philwo commented Jun 24, 2019

I got Bazel to build and pass its entire test suite on CentOS 7.

I’ll verify that the binaries work on all platforms that we supported before and then change our CI to build release binaries on CentOS 7 tomorrow.

@r4nt Thanks, I’ll have a look at that!

@philwo
Member

philwo commented Jun 25, 2019

@r4nt FWIW, I cannot build Bazel inside that container, it fails with C++ errors:

ERROR: /bazel/src/main/tools/BUILD:50:1: Linking of rule '//src/main/tools:build-runfiles' failed (Exit 1): gcc failed: error executing command 
  (cd /tmp/bazel_KI4q5xoT/out/execroot/io_bazel && \
  exec env - \
    LD_LIBRARY_PATH=/opt/rh/devtoolset-8/root/usr/lib64:/opt/rh/devtoolset-8/root/usr/lib:/opt/rh/devtoolset-8/root/usr/lib64/dyninst:/opt/rh/devtoolset-8/root/usr/lib/dyninst:/usr/local/lib64:/usr/local/lib \
    PATH=/opt/rh/devtoolset-8/root/usr/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin \
    PWD=/proc/self/cwd \
  /opt/rh/devtoolset-8/root/usr/bin/gcc @bazel-out/k8-opt/bin/src/main/tools/build-runfiles-2.params)
Execution platform: //:default_host_platform
bazel-out/k8-opt/bin/src/main/tools/_objs/build-runfiles/build-runfiles.o:build-runfiles.cc:function RunfilesCreator::ReadManifest(std::string const&, bool, bool): error: undefined reference to 'std::__throw_out_of_range_fmt(char const*, ...)'
collect2: error: ld returned 1 exit status
Target //src:bazel_nojdk failed to build

(I also can't build it in a vanilla CentOS 6 container, with similar errors. I guess we just use a version of C++ that is too new for CentOS 6's libstdc++ and compilers?)

@perfinion

@philwo devtoolset-8 should be a new enough GCC and libstdc++. It's supposed to statically link in the missing bits, but maybe that part is going wrong? (Note: devtoolset links against the proper CentOS 7 libstdc++ where it can and only statically links the bits that are missing; it doesn't just statically link everything.)

siddharthab pushed a commit to grailbio/bazel-compilation-database that referenced this issue Jun 25, 2019
14.04 is no longer supported by bazel bazelbuild/bazel#8652
@cgruber
Contributor

cgruber commented Jul 1, 2019

Can someone on the bazel side please clarify what the plan of record is? If there's going to be a 0.27.1 fix soon, then we don't have to address this the same way as if there won't. Even just knowing what's planned can help. I can't upgrade to 0.27.x without mitigating this somehow. Just upgrading our build machines is not a workable option for reasons I can't get into.

@philwo
Member

philwo commented Jul 1, 2019

@cgruber Please see #7816 (comment) and the announcement on the mailing list. Bazel 0.27.1 should be released in the next few days. If you have any questions, let me know.

@cgruber
Contributor

cgruber commented Jul 1, 2019

Oh! Sorry, I missed it. Thank you.

@philwo
Member

philwo commented Jul 1, 2019

No worries 😊 I should have pointed it out here on the bug, too.

siberex pushed a commit to siberex/bazel that referenced this issue Jul 4, 2019
@philwo philwo closed this as completed Jul 8, 2019
@philwo
Member

philwo commented Jul 8, 2019

I'm closing this because Bazel 0.27.1 is released and versions from now on are built and tested on CentOS 7, so we should be good here. :)

irengrig pushed a commit to irengrig/bazel that referenced this issue Jul 15, 2019