Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python wheels impose vague and undocumented restrictions on the use of C++ features #1963

Open
olupton opened this issue Aug 22, 2022 · 5 comments

Comments

@olupton
Copy link
Collaborator

olupton commented Aug 22, 2022

Overview

While working on #1922 I started seeing linker errors:

nrnmain.cpp:(.text.startup+0x98): undefined reference to `nrnmpi_load[abi:cxx11](int)

in the CI jobs that test the Linux Python wheels (which are also built as part of the CI). The relevant change in that PR was that I changed the return type of that function from char* to std::string.

In general, replacing C-style strings and manual memory management with standard types such as std::string is a Good Thing ™️, so it's unfortunate that this causes problems with use cases that we consider important (pip install neuron on Linux...). If supporting these use cases forbids the use of certain C++ features or types in certain places, this should be made explicit and documented.

Detailed description

The problem is that the Python wheels are built using an image based on manylinux2014 that is based on CentOS7, and that this only supports using a pre-C++11 ABI for std::string (link, link, ...). This means that the old ABI is used for code in libnrniv.so, which is shipped inside the wheels.

The typical workflow of running nrnivmodl after pip install neuron on a "user" machine builds C++ sources using the compiler toolchain that is installed on that "user" machine, which must support at least C++17 and will default to using a C++11-compatible std::string implementation. This causes link errors if any code compiled on the user machine tries to refer to symbols inside libnrniv.so that use std::string (for example, a function that returns std::string).

Trying to force the use of the old ABI on user machines inside nrnivmodl would probably be fragile, and would not avoid issues with linking to other software on the user machine that uses the new ABI. Dropping support for manylinux2014 would(?) cause issues with pip install neuron using vanilla Python 3.7.x.

These issues can be avoided by only using a restricted subset of C++ in functions/variables that may be called from code compiled on the user machine, but there is no(?) well-defined list of these functions/variables or enforcement mechanism beyond hoping that the wheel-based CI pipelines catch any infractions.

Further information

Some other links that may be of interest:

@olupton
Copy link
Collaborator Author

olupton commented Sep 20, 2022

This came up again in #1929, where an intermediate commit had a non-inline version of https://github.com/neuronsimulator/nrn/blob/4ac3ed9e46a29af193c807c50c101f88537a83e1/src/nrniv/backtrace_utils.h#L13-L24 built into libnrniv.so that caused linker errors downstream when using the Python wheels and nrnivmodl.

@olupton
Copy link
Collaborator Author

olupton commented Jan 6, 2023

The latest manifestation of this in #2027 involves a class with an std::string member.

struct foo {
  std::string m_name;
  short m_number;
};

which cannot safely be passed between code compiled as part of the wheel (libnrniv.so) and inside nrnivmodl -- specifically attempting to access m_number leads to junk results.

@olupton
Copy link
Collaborator Author

olupton commented Jun 5, 2023

Note that in BlueBrain/libsonata#273 we seem to have come across another, this time std::regex-related, incarnation of ABI issues when running a mix of manylinux2014 and modern compiled code in a single process.

@JCGoran
Copy link
Collaborator

JCGoran commented Jan 10, 2025

A lot of time has passed since this issue was originally opened, so I thought I'd provide some updates:

  • BB5, which ran CentOS 7 (i.e. glibc 2.17), has been shut down because BBP ended at the end of 2024
  • Python 3.7 and Python 3.8 have reached EOL, and have since been dropped from NEURON 9
  • existing EPFL HPC infra is running at least RHEL 9, which has glibc 2.34
  • Open Brain Institute (OBI) has runners with at least glibc 2.34

The (supported) successor to manylinux2014, the base image we are using for building NEURON wheels on Linux, is manylinux_2_28.
manylinux_2_28 is based on Almalinux 8, and comes with glibc 2.28 and at least GCC 8, and the value of _GLIBCXX_USE_CXX11_ABI is finally 1 on manylinux_2_28, which means that we may finally be able to get rid of the dual ABI issue by switching to manylinux_2_28. Note that this would invariably drop support for some Linux distributions, since the minimum distro versions for wheels built with this image are:

  • Debian 10+
  • Ubuntu 18.10+
  • Fedora 29+
  • CentOS/RHEL 8+

For reference, here is a shell oneliner to get the value of _GLIBCXX_USE_CXX11_ABI:

printf '#include <iostream>\nint main(){std::cout<<_GLIBCXX_USE_CXX11_ABI;}' | g++ -x c++ - && ./a.out

This prints 0 on manylinux2014, and 1 on manylinux_2_28.

@nrnhines are you aware of any other users that would be interested in running NEURON 9 via Python wheels (I specifically mention wheels since the recommended way to install NEURON on a cluster is using Spack or similar because wheels are unable to take advantage of almost any hardware-specific features that clusters may offer) on systems older than the above?
If not, I would propose we make the switch for NEURON 9 to use manylinux_2_28. As an added bonus, this switch would open up the possibility of upgrading the minimum required C++ standard since manylinux_2_28 comes with GCC 14 (supports almost all C++20/many C++23 features).

UPDATE: the upgrading of the C++ standard will probably need to wait a bit since GCC is not forwards compatible, but in #3306 I managed to get NEURON wheels which shouldn't experience any dual ABI issues.

@nrnhines
Copy link
Member

@nrnhines are you aware of any other users that ...

No. (Though that does not mean there aren't any :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants