Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Statically link compiler-rt #986

Open
wants to merge 14 commits into
base: main
Choose a base branch
from
Open

Statically link compiler-rt #986

wants to merge 14 commits into from

Conversation

oliverhu
Copy link

This PR addesses issues in #834 and tinygrad/tinygrad#1367, in which libllvmlite.so fails to resolve symbols in libclang_rt.builtin.a since compiler rt is not linked. This PR statically links the compiler_rt library.

(address the comments from #976)

ffi/CMakeLists.txt Outdated Show resolved Hide resolved
ffi/build.py Outdated Show resolved Hide resolved
ffi/build.py Outdated Show resolved Hide resolved
Copy link
Member

@gmarkall gmarkall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Building locally on Linux gives me:

$python setup.py develop
... output snipped ...
Traceback (most recent call last):
  File "/home/gmarkall/numbadev/llvmlite/ffi/build.py", line 244, in <module>
    main()
  File "/home/gmarkall/numbadev/llvmlite/ffi/build.py", line 234, in main
    main_posix('linux', '.so')
  File "/home/gmarkall/numbadev/llvmlite/ffi/build.py", line 226, in main_posix
    shutil.copy('libclang_rt.builtins.a', target_dir)
  File "/home/gmarkall/mambaforge/envs/numbadev/lib/python3.10/shutil.py", line 417, in copy
    copyfile(src, dst, follow_symlinks=follow_symlinks)
  File "/home/gmarkall/mambaforge/envs/numbadev/lib/python3.10/shutil.py", line 254, in copyfile
    with open(src, 'rb') as fsrc:
FileNotFoundError: [Errno 2] No such file or directory: 'libclang_rt.builtins.a'
error: command '/home/gmarkall/mambaforge/envs/numbadev/bin/python' failed with exit code 1

If I apply:

diff --git a/ffi/build.py b/ffi/build.py
index a012bb5..4176630 100755
--- a/ffi/build.py
+++ b/ffi/build.py
@@ -101,15 +101,7 @@ def main_windows():
     # Run configuration step
     try_cmake(here_dir, build_dir, *generator)
     subprocess.check_call(['cmake', '--build', build_dir, '--config', config])
-    try:
-        shutil.copy(os.path.join(build_dir, config, 'llvmlite.dll'), target_dir)
-    except shutil.SameFileError:
-        pass
-
-    try:
-        shutil.copy(os.path.join(build_dir, 'clang_rt.builtins.lib'), target_dir)
-    except shutil.SameFileError:
-        pass
+    shutil.copy(os.path.join(build_dir, config, 'llvmlite.dll'), target_dir)
 
 
 def main_posix_cmake(kind, library_ext):
@@ -223,7 +215,6 @@ def main_posix(kind, library_ext):
     makeopts = os.environ.get('LLVMLITE_MAKEOPTS', default_makeopts).split()
     subprocess.check_call(['make', '-f', makefile] + makeopts)
     shutil.copy('libllvmlite' + library_ext, target_dir)
-    shutil.copy('libclang_rt.builtins.a', target_dir)
 
 
 def main():

(which removes things I was confused about - I'm not saying they're not required, but I'm yet to understand their purpose) then the build completes, but I get an error running tests (so none of the binding tests run):

$ python runtests.py 
E...................................................................................................................................................................................................
======================================================================
ERROR: llvmlite.tests.test_binding (unittest.loader._FailedTest)
----------------------------------------------------------------------
ImportError: Failed to import test module: llvmlite.tests.test_binding
Traceback (most recent call last):
  File "/home/gmarkall/numbadev/llvmlite/llvmlite/binding/ffi.py", line 136, in __getattr__
    return self._fntab[name]
KeyError: 'LLVMPY_AddSymbol'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/gmarkall/numbadev/llvmlite/llvmlite/binding/ffi.py", line 119, in _load_lib
    _ = self._lib_handle.__gnu_f2h_ieee()
  File "/home/gmarkall/mambaforge/envs/numbadev/lib/python3.10/ctypes/__init__.py", line 387, in __getattr__
    func = self.__getitem__(name)
  File "/home/gmarkall/mambaforge/envs/numbadev/lib/python3.10/ctypes/__init__.py", line 392, in __getitem__
    func = self._FuncPtr((name_or_ordinal, self))
AttributeError: /home/gmarkall/numbadev/llvmlite/llvmlite/binding/libllvmlite.so: undefined symbol: _lib_wrapper__gnu_f2h_ieee

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/gmarkall/mambaforge/envs/numbadev/lib/python3.10/unittest/loader.py", line 436, in _find_test_path
    module = self._get_module_from_name(name)
  File "/home/gmarkall/mambaforge/envs/numbadev/lib/python3.10/unittest/loader.py", line 377, in _get_module_from_name
    __import__(name)
  File "/home/gmarkall/numbadev/llvmlite/llvmlite/tests/test_binding.py", line 17, in <module>
    from llvmlite import binding as llvm
  File "/home/gmarkall/numbadev/llvmlite/llvmlite/binding/__init__.py", line 4, in <module>
    from .dylib import *
  File "/home/gmarkall/numbadev/llvmlite/llvmlite/binding/dylib.py", line 36, in <module>
    ffi.lib.LLVMPY_AddSymbol.argtypes = [
  File "/home/gmarkall/numbadev/llvmlite/llvmlite/binding/ffi.py", line 139, in __getattr__
    cfn = getattr(self._lib, name)
  File "/home/gmarkall/numbadev/llvmlite/llvmlite/binding/ffi.py", line 131, in _lib
    self._load_lib()
  File "/home/gmarkall/numbadev/llvmlite/llvmlite/binding/ffi.py", line 125, in _load_lib
    raise OSError("Could not find/load shared object file") from e
OSError: Could not find/load shared object file


----------------------------------------------------------------------
Ran 196 tests in 0.046s

FAILED (errors=1)

Maybe this is because the code I removed is needed, but had some other issue? Or maybe the issue is elsewhere?

@oliverhu
Copy link
Author

oliverhu commented Aug 26, 2023

@gmarkall found the problem with the test...the issue is related to cdll loading logic, left a comment in the code: https://github.com/numba/llvmlite/pull/986/files#diff-a23ac0d3c4c6d8093d709eb3c5d28fc4dadae4e58e648ba306485a7f4eddd776R119

@oliverhu
Copy link
Author

After more investigation... I reverted the build change for macOS & Windows, and added a test for Linux.

Issues for macOS:

  1. For M1 Macbook, there is native fp16 register so I can't reproduce the missing symbol issue for __gnu_f2h_ieee.
  2. The clang linker in macOS behaves different than GNU. For example: libclang_rt.osx.a has visible symbols ___divtc3, verified via otool as well and there is no hidden flag.
nm /Users/khu/miniconda3/envs/rt/lib/clang/14.0.6/lib/darwin/libclang_rt.osx.a | grep ___divtc3
0000000000000000 T ___divtc3

However, linker complains the ___divtc3 symbol is hidden if I try to export that symbol..

❯ clang++ -dynamiclib -o dummy.dylib  -L/Users/khu/miniconda3/envs/rt/lib -Wl,-search_paths_first  /Users/khu/miniconda3/envs/rt/lib/clang/14.0.6/lib/darwin/libclang_rt.osx.a  -Wl,-exported_symbol,___divtc3
ld: warning: cannot export hidden symbol ___divtc3 from /Users/khu/miniconda3/envs/rt/lib/clang/14.0.6/lib/darwin/libclang_rt.osx.a(divtc3.c.o)

Issues for Windows

I'm mainly struggling with dev tooling to figure out what is missing, applied the following diff ( /WHOLEARCHIVE is the Windows equivalent of --no-whole-archive):

diff --git a/ffi/CMakeLists.txt b/ffi/CMakeLists.txt
index 07ab5ec..c6b843e 100755
--- a/ffi/CMakeLists.txt
+++ b/ffi/CMakeLists.txt
@@ -23,7 +23,7 @@ else()
 endif()

 if (${CMAKE_SYSTEM_NAME} MATCHES "Windows")
-    set(COMPILER_RT_LOCATION ${LLVM_LIBRARY_DIR}/windows/clang_rt.builtins-${ARCH}.lib)
+    set(COMPILER_RT_LOCATION ${LLVM_LIBRARY_DIR}/clang/${LLVM_PACKAGE_VERSION}/lib/windows/clang_rt.builtins-${ARCH}.lib)
 else()
     if (EXISTS "$(LLVM_LIBDIR)/clang")
         set(COMPILER_RT_LOCATION ${LLVM_LIBRARY_DIR}/linux/libclang_rt.builtins-${ARCH}.a)
@@ -65,8 +65,12 @@ add_library(llvmlite SHARED assembly.cpp bitcode.cpp core.cpp initfini.cpp
             custom_passes.cpp orcjit.cpp)

 # Link compiler-rt whole archive
+#set_target_properties(llvmlite PROPERTIES
+ #   LINK_FLAGS " -Wl,--whole-archive,${COMPILER_RT_LOCATION},--no-whole-archive ")
+
+target_link_libraries(llvmlite ${COMPILER_RT_LOCATION})
 set_target_properties(llvmlite PROPERTIES
-    LINK_FLAGS "-Lfoobar -Wl,--whole-archive,${COMPILER_RT_LOCATION},--no-whole-archive ")
+    LINK_FLAGS " /WHOLEARCHIVE:${COMPILER_RT_LOCATION} ")

However, when I dump the symbols of the generated libllvmlite.lib and libllvmlite.dll, there is no symbol:

Dump of file C:\Users\khu\llvm\llvmlite\ffi\build\Release\llvmlite.dll

File Type: DLL

  Summary

       BE000 .data
      149000 .pdata
     217C000 .rdata
       A0000 .reloc
        1000 .rsrc
     2A8C000 .text

C:\Program Files (x86)\Microsoft Visual Studio\2019\Community>dumpbin /symbols   C:/Users/khu/llvm/llvmlite/ffi/build/Release/llvmlite.lib
Microsoft (R) COFF/PE Dumper Version 14.29.30151.0
Copyright (C) Microsoft Corporation.  All rights reserved.


Dump of file C:\Users\khu\llvm\llvmlite\ffi\build\Release\llvmlite.lib

File Type: LIBRARY

COFF SYMBOL TABLE
000 010175C7 ABS    notype       Static       | @comp.id
001 00000000 SECT2  notype       External     | __IMPORT_DESCRIPTOR_llvmlite
002 C0000040 SECT2  notype       Section      | .idata$2
003 00000000 SECT3  notype       Static       | .idata$6
004 C0000040 UNDEF  notype       Section      | .idata$4
005 C0000040 UNDEF  notype       Section      | .idata$5
006 00000000 UNDEF  notype       External     | __NULL_IMPORT_DESCRIPTOR
007 00000000 UNDEF  notype       External     | ⌂llvmlite_NULL_THUNK_DATA

String Table Size = 0x54 bytes

COFF SYMBOL TABLE
000 010175C7 ABS    notype       Static       | @comp.id
001 00000000 SECT2  notype       External     | __NULL_IMPORT_DESCRIPTOR

String Table Size = 0x1D bytes

COFF SYMBOL TABLE
000 010175C7 ABS    notype       Static       | @comp.id
001 00000000 SECT2  notype       External     | ⌂llvmlite_NULL_THUNK_DATA

String Table Size = 0x1E bytes

  Summary

          C6 .debug$S
          14 .idata$2
          14 .idata$3
           8 .idata$4
           8 .idata$5
           E .idata$6

Shall we fix the Linux version first?

@oliverhu
Copy link
Author

oliverhu commented Sep 2, 2023

Additional debugging for macOS, got some help from llvm discourse: https://discourse.llvm.org/t/lld-automatically-hide-symbols-with-prefix/73192, nm -m shows the symbols are private external.

/Users/khu/miniconda3/envs/rt/lib/clang/14.0.6/lib/darwin/libclang_rt.osx.a(divtc3.c.o):
0000000000000000 (__TEXT,__text) private external ___divtc3
                 (undefined) external _logbl
                 (undefined) external _scalbnl

@oliverhu
Copy link
Author

oliverhu commented Sep 2, 2023

After checking out the compiler-rt source, It seems we need to turn off COMPILER_RT_BUILTINS_HIDE_SYMBOLS flag, and also need to fix an issue that symbol visibility was hard coded to be off in the Darwin cmake module (compiler-rt-14.0.6.src/cmake/Modules/CompilerRTDarwinUtils.cmake), something like this:

macro(darwin_add_builtin_libraries)
  set(DARWIN_EXCLUDE_DIR ${CMAKE_CURRENT_SOURCE_DIR}/Darwin-excludes)

-set(CFLAGS "-fPIC -O3 -fvisibility=hidden -DVISIBILITY_HIDDEN -Wall -fomit-frame-pointer")
+set(CFLAGS "-fPIC -O3  -Wall -fomit-frame-pointer")
+append_list_if(COMPILER_RT_BUILTINS_HIDE_SYMBOLS -fvisibility=hidden -DVISIBILITY_HIDDEN CFLAGS)
  set(CMAKE_C_FLAGS "")
  set(CMAKE_CXX_FLAGS "")
  set(CMAKE_ASM_FLAGS "")

After the change:

❯ nm -m ./lib/darwin/libclang_rt.osx.a | grep ___divtc3
0000000000000000 (__TEXT,__text) external ___divtc3

@esc
Copy link
Member

esc commented Nov 15, 2023

@gmarkall @oliverhu -- wanted to add a label to this. What state is it in?

@gmarkall
Copy link
Member

@esc It's waiting on me - I've been struggling to find the time to get back to progressing with it.

@gmarkall
Copy link
Member

(It's waiting for #979, which is also waiting for me)

@esc
Copy link
Member

esc commented Nov 16, 2023

@gmarkall ok, thank you for the info. I have updated the labels accordingly.

gmarkall added a commit to gmarkall/llvmlite that referenced this pull request Apr 15, 2024
The symbols in the compiler-rt builtins library are private external by
default on MacOS. We need them to be visible so we can link against them
(e.g. when building libllvmlite.so). This patch makes the symbols
visible when the CMake option `COMPILER_RT_BUILTINS_HIDE_SYMBOLS` is
off. Note that this is an existing option for the compiler-rt build, and
the flags requiring modification in this patch appear to be an
oversight.

References:

- https://discourse.llvm.org/t/lld-automatically-hide-symbols-with-prefix/73192
- numba#986 (comment)
@mr-c
Copy link

mr-c commented Jul 6, 2024

Hello, given that #979 is closed via #1036 ; I was wondering if there was an update on this. Thanks!

@gmarkall
Copy link
Member

Hello, given that #979 is closed via #1036 ; I was wondering if there was an update on this. Thanks!

We meed #1067 to get merged so that all llvmlite builds are using an llvmdev package that includes the compiler-rt builtins, then I can bump this PR up to date and hope to proceed with it.

@gmarkall
Copy link
Member

Testing in #1072.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants