Clang/LLVM Toolchain Support #777

Open
8 of 12 tasks
stephanosio opened this issue Sep 6, 2024 · 12 comments

@stephanosio
Member

stephanosio commented Sep 6, 2024

This enhancement issue describes the Clang/LLVM toolchain support plan in the Zephyr SDK.

Goals

  • Provide LLVM Binary Utilities as part of the SDK
  • Provide Clang C/C++ Compiler as part of the SDK
  • Support SDK Clang/LLVM toolchain in Zephyr

Specifications

  • LLVM Binary Utilities

    • LLVM Binary Utilities shall be an optional component of the SDK that is available separately from the SDK GNU Binutils.
    • The following LLVM Binary Utilities binaries shall be provided [1]:
      • dsymutil
      • lld
      • llvm-ar
      • llvm-cov
      • llvm-cxxfilt
      • llvm-dwarfdump
      • llvm-nm
      • llvm-objcopy
      • llvm-objdump
      • llvm-profdata
      • llvm-ranlib (-> llvm-ar)
      • llvm-readelf (-> llvm-readobj)
      • llvm-readobj
      • llvm-size
      • llvm-strip (-> llvm-objcopy)
      • llvm-symbolizer
      • wasm-ld (-> lld)
  • Clang Compiler

    • Clang Compiler shall be an optional component of the SDK that is available separately from the SDK GCC.
    • Clang Compiler shall be configured for the C and C++ languages.
    • The following Clang compiler binaries shall be provided:
      • clang
      • clang++ (-> clang)
      • clang-cpp (-> clang)
  • Pre-built Libraries

    • The following pre-built libraries shall be provided:
      • compiler-rt
      • picolibc
      • libunwind
      • libc++abi
      • libc++
    • The pre-built libraries shall be provided for the following Arm targets [2]:
      • armv6m_soft_nofp
      • armv7em_hard_fpv4_sp_d16
      • armv7em_hard_fpv5_d16
      • armv7em_soft_nofp
      • armv7m_soft_fpv4_sp_d16
      • armv7m_soft_nofp
      • armv8.1m.main_hard_fp_nomve
      • armv8.1m.main_hard_fpdp_nomve
      • armv8.1m.main_hard_nofp_mve
      • armv8.1m.main_hard_nofp_nomve
      • armv8.1m.main_hard_fp
      • armv8.1m.main_soft_nofp
    • The pre-built libraries shall be provided for the following RISC-V targets [3]:
      • rv32e_zicsr_zifencei/ilp32e
      • rv32em_zicsr_zifencei/ilp32e
      • rv32emc_zicsr/ilp32e
      • rv32emc_zicsr_zba_zbb_zbc_zbs/ilp32e
      • rv32emc_zicsr_zifencei/ilp32e
      • rv32emc_zicsr_zifencei_zba_zbb_zbc_zbs/ilp32e
      • rv32i_zicsr_zifencei/ilp32
      • rv32if_zicsr_zifencei/ilp32
      • rv32im_zicsr_zifencei/ilp32
      • rv32im_zicsr_zifencei_zba_zbb_zbc_zbs/ilp32
      • rv32imac_zicsr_zifencei/ilp32
      • rv32imafc_zicsr_zifencei/ilp32
      • rv32imafd_zicsr_zifencei/ilp32
      • rv32imfc_zicsr_zifencei/ilp32
  • Zephyr Build System Integration

    • The SDK Clang/LLVM toolchain shall be selectable by setting ZEPHYR_TOOLCHAIN_VARIANT=zephyr-llvm.
      • The existing ZEPHYR_TOOLCHAIN_VARIANT=zephyr (a.k.a. GCC) shall be renamed to ZEPHYR_TOOLCHAIN_VARIANT=zephyr-gnu.
      • ZEPHYR_TOOLCHAIN_VARIANT=zephyr shall map to ZEPHYR_TOOLCHAIN_VARIANT=zephyr-gnu for the foreseeable future in order to ensure compatibility.
    • The zephyr-llvm toolchain shall invoke Clang compiler with LLVM Linker [4]:
      • Build system configuration: COMPILER=clang, BINTOOLS=llvm, LINKER=lld
  • SDK Distribution

    • Distribution Archive
      • (Superseded) The existing "GNU toolchain distribution archive" (toolchain_HOST-TARGET) shall be split into "GNU Binutils distribution archive" (toolchain_gnu-binutils_OS-TARGET) and "GCC distribution archive" (toolchain_gcc_HOST-TARGET) [5]. (Current) Binutils and GCC will be kept in the same "GNU" archive, since Zephyr SDK LLVM support will default to using LLVM lld, not GNU ld.
      • A new distribution archive for LLVM Binary Utilities (toolchain_llvm-binutils_HOST) shall be added.
      • A new distribution archive for Clang Compiler (toolchain_clang-base_HOST) shall be added.
      • A new distribution archive for Clang Pre-built Libraries (toolchain_clang-lib_HOST-TARGET) shall be added.
    • Distribution Bundle
      • The existing "GNU toolchain distribution bundle" (zephyr-sdk-VER_HOST) shall be renamed to zephyr-sdk-VER-gcc_HOST.
      • A new distribution bundle consisting of Clang/LLVM toolchain (zephyr-sdk-VER-clang_HOST) shall be added.
      • The distribution bundle installation script shall be updated to support installation of individual GNU Binutils, LLVM Binutils, GCC, Clang components for the minimal distribution bundle.
  • Clang/LLVM Toolchain Build Process

    • A custom build script shall be implemented for building Clang/LLVM toolchain [6].
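
As a rough illustration of the variant handling in the "Zephyr Build System Integration" specification above, the mapping can be sketched in shell. The function names are hypothetical, the real logic would live in Zephyr's CMake toolchain files rather than a shell script, and the GNU row (gcc/gnu/ld) is an assumption that this issue does not state.

```shell
# Hypothetical helper: normalize a ZEPHYR_TOOLCHAIN_VARIANT value.
# "zephyr" is kept as a compatibility alias for "zephyr-gnu".
resolve_variant() {
    case "$1" in
        zephyr|zephyr-gnu) echo "zephyr-gnu" ;;
        zephyr-llvm)       echo "zephyr-llvm" ;;
        *) echo "unknown toolchain variant: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: print the build system configuration for a variant.
toolchain_config() {
    variant=$(resolve_variant "$1") || return 1
    case "$variant" in
        # From the specification: Clang with LLVM binutils and lld.
        zephyr-llvm) echo "COMPILER=clang BINTOOLS=llvm LINKER=lld" ;;
        # Assumption: the GNU variant keeps GCC, GNU binutils and GNU ld.
        zephyr-gnu)  echo "COMPILER=gcc BINTOOLS=gnu LINKER=ld" ;;
    esac
}
```

For example, `toolchain_config zephyr` and `toolchain_config zephyr-gnu` would give the same result, which is the compatibility behaviour the specification asks for.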

[1] Based on the "LLVM Binary Utilities" included in LLVM Embedded Toolchain for Arm.
[2] Based on the pre-built multi-libs included in LLVM Embedded Toolchain for Arm.
[3] Based on the pre-built 32-bit RISC-V multi-libs currently included in the SDK GCC.
[4] Earlier assessment: LLVM Linker (lld) support was very limited, making GNU Linker (ld) desirable for maximum compatibility; note that GNU Linker is also used by default for ZEPHYR_TOOLCHAIN_VARIANT=llvm. Updated assessment: lld support seems to be sufficiently mature now.
[5] This ensures that GNU Linker, which is part of GNU Binutils, can be installed alongside Clang/LLVM toolchain without installing GCC.
[6] Ideally, we would implement Clang/LLVM toolchain support in crosstool-ng and upstream it; but, doing so will likely delay this task too much for our liking -- we will see.
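
To make the "SDK Distribution" naming scheme above concrete, here is a small shell sketch that expands the placeholders. The function names are hypothetical, HOST and TARGET are joined exactly as the placeholders are written in the specification, and the result may not match the final file names (the alpha archive later in this thread is named zephyr-sdk-0.18.0-alpha1_linux-x86_64_llvm.tar.xz, which already differs from the planned bundle name).

```shell
# Hypothetical helper: list the new LLVM-related distribution archive names
# for one host and one target, following the placeholders in the spec.
sdk_llvm_archives() {
    host=$1
    target=$2
    echo "toolchain_llvm-binutils_${host}"
    echo "toolchain_clang-base_${host}"
    echo "toolchain_clang-lib_${host}-${target}"
}

# Hypothetical helper: planned Clang/LLVM distribution bundle name.
sdk_llvm_bundle() {
    ver=$1
    host=$2
    echo "zephyr-sdk-${ver}-clang_${host}"
}

# Usage (the TARGET value is a placeholder, not a confirmed target name):
#   sdk_llvm_archives linux-x86_64 TARGET
#   sdk_llvm_bundle 0.18.0 linux-x86_64
```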

Tasks

Phase 1

Inclusion of Clang/LLVM toolchain binaries and pre-built compiler-rt library for Arm M-profile cores and RISC-V RV32I and RV32E cores in the SDK

  • Implement LLVM binary utilities build process
  • Implement Clang compiler build process
  • Implement compiler-rt multi-lib build process for Arm
  • Implement compiler-rt multi-lib build process for RISC-V
  • Rework the distribution archive build process as per the "SDK Distribution" specifications above
  • Rework the distribution bundle build process as per the "SDK Distribution" specifications above

Phase 2

Addition of pre-built C/C++ libraries for Arm M-profile cores and RISC-V RV32I and RV32E cores to the SDK

  • Implement Picolibc multi-lib build process
  • Implement libunwind multi-lib build process
  • Implement libc++abi multi-lib build process
  • Implement libc++ multi-lib build process

Future

(Nothing concrete about these ...)

  • Implement Clang/LLVM toolchain support in crosstool-ng and refactor the SDK build process to use it
  • Add pre-built libraries for other architectures (i.e. other than Arm and RISC-V)

Resources

@stephanosio stephanosio added enhancement area: Toolchain Issues related to Toolchain (Binutils+GCC+GDB+libs) area: Clang Issues related to Clang labels Sep 6, 2024
@stephanosio stephanosio self-assigned this Sep 6, 2024
@stephanosio
Member Author

FYI @carlescufi @tejlmand

@dkalowsk

dkalowsk commented Sep 6, 2024

@stephanosio are there expected to be Zephyr specific changes to the LLVM build? If so, would it make sense to include libclang to allow developers to build extensions on it?

@stephanosio
Member Author

@stephanosio are there expected to be Zephyr specific changes to the LLVM build?

Not right now; but, it is foreseeable that we will need to carry some local patches in the future.

If so, would it make sense to include libclang to allow developers to build extensions on it?

That is an interesting topic -- if it does not add too much to the build time or the distribution archive size, I think libclang would be a reasonable addition.

@carlescufi
Member

carlescufi commented Sep 10, 2024

Architecture WG:

  • @nashif asks about CI, and how we will manage with a new toolchain, given that we are already stretched with GCC only
  • @tejlmand states that adding the toolchain to the SDK is not technically making anything different from what we have today.
  • @nashif: we need to ensure cross-toolchain compatibility with all features of Zephyr. No cross-arch features that are limited to a single toolchain
  • @cfriedt suggests using the new clang distribution to try and solve the native_sim problem on macOS

@jonathonpenix

I'm super excited to see this--thank you @stephanosio for creating this! This looks great to me, though I do have a few questions I'm hoping to run by you:

  1. Could we consider including AArch64 support as part of this (at least a very basic config similar to what is included in the LLVM Embedded Toolchain for Arm)? As best I understand, all the Arm library variants listed above are 32bit. AArch64 is well supported in LLVM and Zephyr has some baseline support for LLVM + AArch64 as well.
  2. Are you targeting a particular version of LLVM to start with? I think 18 is the most recent official release, but LLVM has branched 19.1.0-rc4 and I think is getting close to branching 19.1.0-final--so, 19 might be an option soon as well. When I've tested locally, I have seen a few issues that were fixed in LLVM's mainline/later releases--it likely isn't a huge deal either way, but might be something to keep in mind.
  3. Related to 2 above, LLVM's multilib support might also be a challenge here. LLVM's YAML solution was added for baremetal Arm/AArch64 a few releases ago (and is still seeing active changes) but this was only very recently extended to RISC-V (llvm/llvm-project@b221c37, ~1 month ago)--I don't think robust multilib support will be in place in LLVM for RISC-V unless we use a mainline build of LLVM or we wait for the next release. Historically, there was only very minimal hard-coded RISC-V "multilib support" (see here in the 19 release sources, for example). So, if LLVM <= 19 is used to start with, maybe it would be worth including only a minimal subset of the RISC-V library variants to begin with and expanding this later when we have access to better multilib capabilities?
  4. Related to "A custom build script shall be implemented for building Clang/LLVM toolchain" and "Implement Clang/LLVM toolchain support in crosstool-ng and refactor the SDK build process to use it": I know you mentioned that this isn't concrete (and I'm not familiar with crosstool-ng so I'm not sure how much work this would be or if there is any LLVM support already in place), but IIRC one thing that was mentioned in one of the recent LLVM RISC-V or Embedded toolchain meetings was whether there'd be any interest in essentially an "LLVM Embedded Toolchain for Arm" but extended to support building for RISC-V as well. Would this be something that you/Zephyr would be interested in using to build the toolchain? Or, is crosstool-ng (or other scripts, etc.) preferred long-term? If you/Zephyr would be interested, I'd be interested in discussing more and looping in a few LLVM folks!

@stephanosio
Member Author

1. Could we consider including AArch64 support as part of this (at least a very basic config similar to what is included in the LLVM Embedded Toolchain for Arm)? As best I understand, all the Arm library variants listed above are 32bit. AArch64 is well supported in LLVM and Zephyr has some baseline support for LLVM + AArch64 as well.

Yes, I am planning to also include the AArch64 support if there is no major obstacle in doing so (AFAICS, there should be none).

2. Are you targeting a particular version of LLVM to start with? I think 18 is the most recent official release, but LLVM has branched 19.1.0-rc4 and I think is getting close to branching 19.1.0-final--so, 19 might be an option soon as well. When I've tested locally, I have seen a few issues that were fixed in LLVM's mainline/later releases--it likely isn't a huge deal either way, but might be something to keep in mind.

I am planning to start with LLVM 18 for now, mainly because that is what the latest release of LLVM Embedded Toolchain for Arm has and I will be implementing the SDK build script based on that.

3. Related to 2 above, LLVM's multilib support might also be a challenge here. LLVM's YAML solution was added for baremetal Arm/AArch64 a few releases ago (and is still seeing active changes) but this was only very recently extended to RISC-V (llvm/llvm-project@b221c37, ~1 month ago)--I don't think robust multilib support will be in place in LLVM for RISC-V unless we use a mainline build of LLVM or we wait for the next release. Historically, there was only very minimal hard-coded RISC-V "multilib support" (see here in the 19 release sources, for example). So, if LLVM <= 19 is used to start with, maybe it would be worth including only a minimal subset of the RISC-V library variants to begin with and expanding this later when we have access to better multilib capabilities?

At first glance, writing a patch to support the additional variants required by the Zephyr SDK (especially, the RV32E variants) looks like it should be fairly simple -- we can carry that patch in the Zephyr fork of LLVM until a new LLVM release with more robust RISC-V multi-lib support is available.

If that turns out to be not so simple for whatever reason, I will have to look into upgrading to a newer LLVM codebase or even brute-forcing the Zephyr build system to manually handle the RISC-V multi-lib variants ...

4. Related to "A custom build script shall be implemented for building Clang/LLVM toolchain" and "Implement Clang/LLVM toolchain support in crosstool-ng and refactor the SDK build process to use it": I know you mentioned that this isn't concrete (and I'm not familiar with crosstool-ng so I'm not sure how much work this would be or if there is any LLVM support already in place), but IIRC one thing that was mentioned in one of the recent LLVM RISC-V or Embedded toolchain meetings was whether there'd be any interest in essentially an "LLVM Embedded Toolchain for Arm" but extended to support building for RISC-V as well. Would this be something that you/Zephyr would be interested in using to build the toolchain? Or, is crosstool-ng (or other scripts, etc.) preferred long-term? If you/Zephyr would be interested, I'd be interested in discussing more and looping in a few LLVM folks!

crosstool-ng is a very popular tool for building embedded cross compiler toolchains, and that is what Zephyr SDK build system currently uses to build the GNU toolchains.

crosstool-ng is nice because it abstracts away all the tedious processes involved in building cross-compiler toolchains (e.g. builder OS-specific dependencies, host OS-specific dependencies, target dependencies ...) and allows one to build a toolchain from a Kconfig-based "recipe."

While it may not sound like much at first, it can be very useful when you are setting up a complex toolchain build environment such as one for a Canadian cross compiler (e.g. building an aarch64-linux host toolchain on an x86_64-linux build machine for a riscv64-elf target).

While an "LLVM Embedded Toolchain for RISC-V" that comes with custom scripts to build a general-purpose RISC-V LLVM toolchain sounds interesting, IMHO, having the ability to build any flavour of LLVM you want from a Kconfig-based recipe using crosstool-ng would be nicer.

@jonathonpenix

Yes, I am planning to also include the AArch64 support if there is no major obstacle in doing so (AFAICS, there should be none).

Awesome! Apologies if I overlooked that somewhere.

I am planning to start with LLVM 18 for now, mainly because that is what the latest release of LLVM Embedded Toolchain for Arm has and I will be implementing the SDK build script based on that.

Sounds good!

At first glance, writing a patch to support the additional variants required by the Zephyr SDK (especially, the RV32E variants) looks like it should be fairly simple -- we can carry that patch in the Zephyr fork of LLVM until a new LLVM release with more robust RISC-V multi-lib support is available.

If that turns out to be not so simple for whatever reason, I will have to look into upgrading to a newer LLVM codebase or even brute-forcing the Zephyr build system to manually handle the RISC-V multi-lib variants ...

Also sounds good! I think you're right that it should be fairly simple to add a patch to support additional variants but I also can't say I've tried it.

While an "LLVM Embedded Toolchain for RISC-V" that comes with custom scripts to build a general-purpose RISC-V LLVM toolchain sounds interesting, IMHO, having the ability to build any flavour of LLVM you want from a Kconfig-based recipe using crosstool-ng would be nicer.

I see, makes sense! I don't have any particular objection here, it had just come up in LLVM's community a few times recently so I thought I would ask 🙂

Thank you again for proposing this and I'm excited to see this happen!

@apazos

apazos commented Sep 12, 2024

Great initiative, @stephanosio. To add on to @jonathonpenix's list, do you plan to include RISC-V 64-bit targets as well? Including Arm 32-bit and 64-bit targets, and RISC-V 32-bit and 64-bit targets, will be useful. Also, are there any plans to include hard float support?

@stephanosio stephanosio added this to the 0.17.0 milestone Sep 19, 2024
@stephanosio stephanosio modified the milestones: 0.17.0, 0.18.0 Oct 3, 2024
@keith-packard
Collaborator

I'm working on llvm toolchain support in picolibc by building with the LLVM embedded toolchain for Arm. I've found a few compatibility issues with clang and llvm-as, along with some differences in behavior between compiler-rt and libgcc (some look like bugs in compiler-rt to me). In any case, picolibc will be using that toolchain in CI going forward. Once there's a Zephyr clang toolchain, I'll add that to picolibc CI as well. picolibc/picolibc#856

@stephanosio
Member Author

stephanosio commented Oct 24, 2024

A preliminary test build from topic-clang branch is available with AArch64 and ARM (32-bit) multi-libs. No RISC-V multi-libs are available at this time.

To test, follow the steps below:

  1. Download https://github.com/zephyrproject-rtos/sdk-ng/releases/download/v0.18.0-alpha1/zephyr-sdk-0.18.0-alpha1_linux-x86_64_llvm.tar.xz.
  2. Extract zephyr-sdk-0.18.0-alpha1_linux-x86_64_llvm.tar.xz.
  3. Run zephyr-sdk-0.18.0-alpha1/setup.sh -h.
  4. Set environment:
    export ZEPHYR_TOOLCHAIN_VARIANT=zephyr-llvm
    export ZEPHYR_SDK_INSTALL_DIR=/where-you-extracted/zephyr-sdk-0.18.0-alpha1
    
  5. Check out collab-sdk-0.18-dev branch.
  6. Build something.
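
Step 4 above can be wrapped in a small shell function so that a wrong extraction path is caught early. This is only a sketch: the function name is hypothetical, and the download, setup.sh and build steps are left as comments because they depend on the local setup.

```shell
# Hypothetical helper: export the two variables from step 4 for a given
# SDK directory, refusing to do so if the directory does not exist.
use_zephyr_llvm_sdk() {
    sdk_dir=$1
    if [ ! -d "$sdk_dir" ]; then
        echo "SDK directory not found: $sdk_dir" >&2
        return 1
    fi
    export ZEPHYR_TOOLCHAIN_VARIANT=zephyr-llvm
    export ZEPHYR_SDK_INSTALL_DIR="$sdk_dir"
}

# Usage, after steps 1 to 3 (download, extract, run setup.sh -h):
#   use_zephyr_llvm_sdk /where-you-extracted/zephyr-sdk-0.18.0-alpha1
# and then, from a Zephyr workspace on the collab-sdk-0.18-dev branch,
# build an application of your choice.
```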

@keith-packard
Collaborator

I'm having some adventures getting this toolchain to build picolibc with --multilib=true. I looked at how sdk-ng is building picolibc and it's using --multilib=false and then hand-coding all of the multilib variants and directory paths. I think that is masking some bugs in the current llvm setup...

The multilib configuration file is supposed to map compiler args to library directories, but I'm seeing a difference between what that file contains and where the libraries are getting installed for risc-v:

$ clang -target riscv32-none-elf -march=rv32i -mabi=ilp32 -print-libgcc-file-name
/opt/zephyr-sdk-0.18.0-alpha1-3-g0476815/llvm/lib/clang/19/lib/riscv32-unknown-none-elf/libclang_rt.builtins.a

The /opt/zephyr-sdk-0.18.0-alpha1-3-g0476815/llvm/lib/clang/19/lib directory exists, but it is completely empty. Instead, the library is installed way over in llvm/lib/clang-runtimes/riscv32-none-elf/rv32i_zicsr_zifencei/lib/libclang_rt.builtins.a:

$ ls -l /opt/zephyr-sdk-0.18.0-alpha1-3-g0476815/llvm/lib/clang-runtimes/riscv32-none-elf/rv32i_zicsr_zifencei/lib/libclang_rt.builtins.a
-rw-r--r-- 1 keithp keithp 292696 Nov 23 09:58 /opt/zephyr-sdk-0.18.0-alpha1-3-g0476815/llvm/lib/clang-runtimes/riscv32-none-elf/rv32i_zicsr_zifencei/lib/libclang_rt.builtins.a
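
The mismatch shown above (the driver reports one path, the library lives at another) can be checked mechanically. A minimal sketch, assuming the compiler command accepts clang's -print-libgcc-file-name option; the function name is hypothetical:

```shell
# Hypothetical helper: ask the compiler driver which runtime library it
# would link, then report whether that file actually exists.
# $1 is the compiler command; the remaining arguments are passed through
# unchanged (for example a target triple plus -march/-mabi).
check_builtins() {
    cc=$1
    shift
    lib=$("$cc" "$@" -print-libgcc-file-name) || return 2
    if [ -f "$lib" ]; then
        echo "found: $lib"
    else
        echo "missing: $lib"
        return 1
    fi
}

# Usage, mirroring the command above:
#   check_builtins clang -target riscv32-none-elf -march=rv32i -mabi=ilp32
```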

There's also some weirdness when using --print-multi-lib. If I set --target=arm-none-eabi or --target=aarch64-none-elf, --print-multi-lib dumps all of the configurations, including aarch64, arm, riscv32 and riscv64 configs. If I use --target=riscv32-none-elf or --target=riscv64-none-elf, I only get the results for the specified target, and with completely different configurations.

Here's the riscv32 set using --target=arm-none-eabi:

$ clang --target=arm-none-eabi --print-multi-lib | grep riscv32
riscv32-none-elf/rv32i_zicsr_zifencei_exn_rtti;@-target=riscv32-unknown-none-elf
riscv32-none-elf/rv32i_zicsr_zifencei;@-target=riscv32-unknown-none-elf@fno-exceptions@fno-rtti
riscv32-none-elf/rv32e_zicsr_zifencei_exn_rtti;@-target=riscv32-unknown-none-elf
riscv32-none-elf/rv32e_zicsr_zifencei;@-target=riscv32-unknown-none-elf@fno-exceptions@fno-rtti

and here's the riscv32 set using --target=riscv32-none-elf:

$ clang --target=riscv32-none-elf --print-multi-lib
rv32i/ilp32;@march=rv32i@mabi=ilp32
rv32im/ilp32;@march=rv32im@mabi=ilp32
rv32iac/ilp32;@march=rv32iac@mabi=ilp32
.;@march=rv32imac@mabi=ilp32
rv32imafc/ilp32f;@march=rv32imafc@mabi=ilp32f

Picolibc doesn't (yet) support building for multiple targets in the same -Dmultilib=true tree, so your technique of having the SDK do -Dmultilib=false builds for each desired config is a reasonable plan (the alternative would be to perform a per target build using -Dmultilib=true, which I may attempt). However, I'd like to do upstream testing using the Zephyr SDK, and for that, I'll need -Dmultilib=true to work.
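
For comparing the two --print-multi-lib outputs above, it can help to turn each "directory;@flag@flag" line back into ordinary compiler flags. A small sketch, with a hypothetical function name, that reads such lines on standard input:

```shell
# Hypothetical helper: convert --print-multi-lib lines of the form
#   dir;@flag1@flag2
# into
#   dir: -flag1 -flag2
parse_multilib() {
    while IFS=';' read -r dir flags; do
        # Skip empty lines.
        [ -n "$dir" ] || continue
        # Each '@' introduces one flag; rewrite "@a@b" as " -a -b".
        opts=$(printf '%s' "$flags" | sed 's/@/ -/g')
        printf '%s:%s\n' "$dir" "$opts"
    done
}

# Usage:
#   clang --target=riscv32-none-elf --print-multi-lib | parse_multilib
```

With the riscv32 output above, the line `rv32i/ilp32;@march=rv32i@mabi=ilp32` becomes `rv32i/ilp32: -march=rv32i -mabi=ilp32`, which makes it easier to see which arguments select which library directory.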

@stephanosio
Member Author

stephanosio commented Nov 25, 2024

I'm having some adventures getting this toolchain to build picolibc with --multilib=true. I looked at how sdk-ng is building picolibc and it's using --multilib=false and then hand-coding all of the multilib variants and directory paths. I think that is masking some bugs in the current llvm setup...

You need this patch to get RISC-V multilib from multilib.yaml working. I will soon open a PR pulling that patch into topic-clang.

UPDATE: Done. #839 should pull that LLVM patch and also add the RISC-V multi-libs.
