OSX wheels aren't compiled with OpenMP support #4889

Open
cphyc opened this issue Apr 29, 2024 · 5 comments

Comments

@cphyc
Member

cphyc commented Apr 29, 2024

Bug report

Bug summary

OSX wheels aren't compiled with OpenMP support (this can be seen in the logs of the wheel builds). As far as I can tell, the default clang compiler on OSX doesn't support OpenMP compilation (I do not have an OSX machine to confirm or refute this), so extending the documentation to explain how to compile with OpenMP support would be useful. In any case, I've had users report that installing from source does not compile with OpenMP support either.
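One way to check whether a given compiler accepts OpenMP is to try building a tiny OpenMP program with it. The sketch below does that from Python. The helper name `compiler_supports_openmp` and the test program are illustrative assumptions and are not taken from yt's build scripts; it also assumes the compiler is a single executable name found on `PATH` and that it understands the `-fopenmp` flag.

```python
import os
import shutil
import subprocess
import tempfile

# Tiny C program that only compiles and links if OpenMP is available.
OPENMP_TEST_SOURCE = """
#include <omp.h>
#include <stdio.h>
int main(void) {
    #pragma omp parallel
    printf("thread %d\\n", omp_get_thread_num());
    return 0;
}
"""


def compiler_supports_openmp(cc=None):
    """Return True if `cc` can build a small OpenMP program with -fopenmp.

    `cc` defaults to the CC environment variable, then to "cc".
    Hypothetical helper for illustration; not part of yt.
    """
    cc = cc or os.environ.get("CC", "cc")
    if shutil.which(cc) is None:
        return False  # compiler not found on PATH
    with tempfile.TemporaryDirectory() as tmpdir:
        src = os.path.join(tmpdir, "test_omp.c")
        exe = os.path.join(tmpdir, "test_omp")
        with open(src, "w") as fh:
            fh.write(OPENMP_TEST_SOURCE)
        # A non-zero return code means the compile or link step failed.
        result = subprocess.run(
            [cc, "-fopenmp", src, "-o", exe],
            capture_output=True,
        )
        return result.returncode == 0
```

Calling `compiler_supports_openmp()` with no argument probes whatever `CC` points at, and passing a name such as `"gcc-13"` probes that compiler specifically.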

Code for reproduction

  • Operating System: OSX on x64 and arm
  • Python Version: any
  • yt version: any
  • Other Libraries (if applicable): OpenMP
@nastasha-w
Contributor

Ugh, so that's why that error looked familiar! I'm using an M1 mac, but somehow yt is working. I'm not specifically setting anything to use OpenMP there though, since I'm only running tiny test cases.
I can confirm that the Mac's default compiler, clang, does not support OpenMP. Worse, gcc on the command line is aliased to clang, at least by default. (I have not tried turning that off, so I don't know if it's possible.) I have previously been able to compile C code with OpenMP on my Mac, but only by (1) installing gcc, e.g., from Homebrew, and (2) calling the specific gcc version when compiling, e.g., gcc-13 instead of gcc. This was, however, pure C code that I then called from Python with ctypes, and I think I might actually have had to give up on getting that to work on the M1 Mac. (The 'clang not supporting OpenMP' issue is not new.)
I'm not sure if there's a way to get Python to look up whether there's a real gcc version on a system and to use that instead. Overall, I suppose Apple had some reason for aliasing gcc to clang, but it's a real pain when it doesn't actually support some of gcc's features.
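As a rough sketch of the lookup described above, Python can scan `PATH` for versioned executables such as `gcc-13` and report the newest one, which could then be exported as `CC` before a build. The function name `find_versioned_gcc` is hypothetical, and the sketch assumes that a versioned `gcc-N` name is a real GCC (as with Homebrew installs) while the plain `gcc` may be clang.

```python
import os
import re

# Matches versioned names such as "gcc-13" (but not "gcc-ar-13").
_VERSIONED_GCC = re.compile(r"^gcc-(\d+)$")


def find_versioned_gcc(path=None):
    """Return the full path of the newest executable `gcc-N` on `path`, or None.

    `path` is a PATH-style string; it defaults to the PATH environment variable.
    Hypothetical helper for illustration only.
    """
    if path is None:
        path = os.environ.get("PATH", "")
    best = None  # (version, full_path) of the newest match so far
    for directory in path.split(os.pathsep):
        if not os.path.isdir(directory):
            continue
        try:
            names = os.listdir(directory)
        except OSError:
            continue  # unreadable directory, skip it
        for name in names:
            match = _VERSIONED_GCC.match(name)
            full_path = os.path.join(directory, name)
            if match and os.access(full_path, os.X_OK):
                version = int(match.group(1))
                if best is None or version > best[0]:
                    best = (version, full_path)
    return None if best is None else best[1]
```

If the function returns a path, setting `os.environ["CC"]` to it before invoking the build is one way to steer the build towards that compiler; whether a given build backend honours `CC` is something to verify for the project at hand.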

@nastasha-w
Contributor

... so I checked, and I can confirm that although the C code compiles (and I can run a test C program from the command line), I get a similar issue to the one Jack reported if I try to call the .so file from python:
OSError: dlopen(/Users/nastasha/code/proj-an-c/interp2d/interp.so, 0x0006): tried: '/Users/nastasha/code/proj-an-c/interp2d/interp.so' (mach-o file, but is an incompatible architecture (have 'arm64', need 'x86_64')), '/System/Volumes/Preboot/Cryptexes/OS/Users/nastasha/code/proj-an-c/interp2d/interp.so' (no such file), '/Users/nastasha/code/proj-an-c/interp2d/interp.so' (mach-o file, but is an incompatible architecture (have 'arm64', need 'x86_64'))
The .so file was compiled with gcc-13, specifically Homebrew GCC 13.2.0. By the way, I had to deactivate conda before it would even compile due to some linking library issue, but I did activate conda again to run from python.
Overall, this issue seems to be a general headache on Macs. This Stack Overflow thread seems to have some ideas: https://stackoverflow.com/questions/28010801/compiling-parallelized-cython-with-clang , but it requires installing libraries on your own computer first, and adding a bunch of stuff on the command line. I don't get the impression that there's a straightforward way to get this to work 'off the shelf'.
Honestly, my own 'solution' has been to do small bits of C-from-python development on the login nodes of the university linux cluster. I run all my production analysis on linux-managed clusters anyway.
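The dlopen message above reports "have 'arm64', need 'x86_64'", which suggests that the Python process doing the loading was running as x86_64 while gcc-13 produced an arm64 library. One possible explanation, which is an assumption and not something confirmed in this thread, is a conda environment built for Intel and run through Rosetta. A minimal sketch for comparing the two architectures follows; `explain_mismatch` is a hypothetical helper name.

```python
import platform
import sys


def explain_mismatch(library_arch, python_arch=None):
    """Compare a library's architecture with the running interpreter's.

    `library_arch` is the value reported for the compiled file (for example
    "arm64" in the dlopen error); `python_arch` defaults to the architecture
    of the current Python process.
    """
    python_arch = python_arch or platform.machine()
    if library_arch == python_arch:
        return f"OK: library and interpreter are both {python_arch}"
    return (
        f"Mismatch: library is {library_arch} but {sys.executable} "
        f"runs as {python_arch}"
    )


# Report the architecture of the interpreter running this script.
print("Interpreter architecture:", platform.machine())
```

Running this with the conda environment active and again with it deactivated would show whether the two interpreters report different architectures.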

@neutrinoceros
Member

> I'm using an M1 mac, but somehow yt is working. I'm not specifically setting anything to use OpenMP there though, since I'm only running tiny test cases.

Building for mac arm64 without OpenMP is definitely something we've been exercising (and we've been publishing wheels for it since yt 4.0.4), so as long as you're not trying to enable it, yt is expected to build and run correctly on this arch (albeit at sub-optimal performance).

@nastasha-w
Contributor

Ugh, yeah this mostly makes me wish I trusted myself to maintain a linux system

@neutrinoceros
Member

neutrinoceros commented Apr 30, 2024

I dug a little bit (with @cphyc's help) and found a reasonably painless way to build yt with OpenMP on this platform:

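# test_omp.sh
# Run with bash (relies on pipefail, pushd/popd and source).

set -euxo pipefail

# Install a real GCC and point the build at it.
# The -13 suffix matches the GCC version Homebrew provided at the time;
# adjust it to the version installed on your machine.
brew install gcc
export CXX=g++-13
export CC=gcc-13

# Start from a fresh virtual environment (rm -f ignores a missing .venv).
rm -rf .venv
python -m venv .venv
source .venv/bin/activate

python -m pip install build

# Build and install ewah_bool_utils from source.
git clone https://github.com/yt-project/ewah_bool_utils.git _ewah_bool_utils
pushd _ewah_bool_utils
rm -rf dist
python -m build --wheel
python -m pip install dist/*.whl
popd

# Build and install yt from source.
git clone https://github.com/yt-project/yt.git _yt
pushd _yt
rm -rf dist
python -m build --wheel
python -m pip install dist/*.whl
popd

# Install the extra packages t.py needs, then time it with two thread counts.
python -m pip install pandas h5py pooch
OMP_NUM_THREADS=4 python t.py
OMP_NUM_THREADS=2 python t.py

# t.py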
import os
from time import monotonic_ns

import yt
from tqdm import tqdm

ds = yt.load_sample("output_00080")

NREP = 10  # number of projections to time
tstart = monotonic_ns()
for _ in tqdm(range(NREP)):
    # Projection along the [1, 1, 1] direction.
    p = yt.ProjectionPlot(ds, [1, 1, 1], ("gas", "density"))
    p.render()
tstop = monotonic_ns()
dt = (tstop - tstart) / 1e9  # elapsed wall time in seconds

# Report total and per-iteration time, plus the thread count if it was set.
print(f"Took {dt:.1e} s ({dt / NREP:.1e} s/it)", end="")
if (omp_num_threads := os.environ.get("OMP_NUM_THREADS")) is not None:
    print(f" using {omp_num_threads} OpenMP threads")
else:
    print()

However, because this technique involves dynamically linking libgomp I got from homebrew (/opt/homebrew/opt/gcc/lib/gcc/current/libgomp.1.dylib), the resulting wheel isn't portable, so we cannot apply this to the publishing process.
As noted by @cphyc, portability may be addressable on the conda-forge side (if we're not doing it already).
Meanwhile, we could document this technique, but we need to know whether this is also an issue with conda-forge binaries first.
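To illustrate the portability concern, the sketch below scans text in the format printed by `otool -L` and flags linked libraries that live outside the locations shipped with macOS or resolved relative to the binary. The function name, the list of prefixes, and the sample text are assumptions made for illustration; only the libgomp path is taken from the comment above.

```python
# Prefixes that are shipped with macOS or resolved relative to the binary.
PORTABLE_PREFIXES = (
    "/usr/lib/",
    "/System/Library/",
    "@rpath/",
    "@loader_path/",
    "@executable_path/",
)


def non_portable_dependencies(otool_output):
    """Return linked library paths that are unlikely to exist on another Mac.

    `otool_output` is the text printed by `otool -L <file>`: a first line
    naming the file, then one indented line per linked library.
    """
    flagged = []
    for line in otool_output.splitlines()[1:]:  # skip the file-name line
        line = line.strip()
        if not line:
            continue
        # Drop the trailing "(compatibility version ..., current version ...)".
        path = line.split(" (", 1)[0]
        if not path.startswith(PORTABLE_PREFIXES):
            flagged.append(path)
    return flagged


# Made-up sample: the file name and version numbers are illustrative only.
SAMPLE = """example_extension.so:
\t/opt/homebrew/opt/gcc/lib/gcc/current/libgomp.1.dylib (compatibility version 2.0.0, current version 2.0.0)
\t/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1.0.0)
"""

print(non_portable_dependencies(SAMPLE))
```

For the sample text, the only flagged entry is the Homebrew libgomp path, which is the dependency that would be missing on a machine without Homebrew's gcc.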
