
ppc64(le): Add an option to use IEEE long double ABI on Linux #4833

Merged
merged 15 commits into from
Feb 20, 2025


liushuyu
Contributor

@liushuyu liushuyu commented Feb 1, 2025

This pull request adds an option to use the IEEE long double ABI on Linux (if the host environment supports it).

Some adjustments are made to accommodate the new ABI (glibc uses a dual ABI in this case, where the new IEEE long double-capable functions are suffixed with __ieee128; the older functions using IBM long double are also kept for compatibility reasons).

@liushuyu
Contributor Author

liushuyu commented Feb 1, 2025

During the porting process, I discovered what seems to be an LLVM bug when targeting ppc64 using the IEEE long double ABI. Consider the following D code:

extern (C++) real test1(real arg0)
{
  if (!(arg0 > 0.0))
  {
    return 0.0;
  }
  else
  {
    return 0.0;
  }
}

LDC will generate the following LLVM IR (reduced, no optimization):

  target datalayout = "e-m:e-Fn32-i64:64-i128:128-n32:64-S128-v256:256:256-v512:512:512"
  target triple = "powerpc64le-unknown-linux-gnu"

  define fp128 @_Z5test1u9__ieee128(fp128 %0) {
    %2 = fcmp ogt fp128 %0, 0xL00000000000000000000000000000000
    %3 = icmp eq i1 %2, false
    br i1 %3, label %5, label %4

  4:                                                ; preds = %1
    ret fp128 0xL00000000000000000000000000000000

  5:                                                ; preds = %1
    ret fp128 0xL00000000000000000000000000000000
  }

... which will lead to what seems to be an infinite ppc-isel expansion loop inside LLVM.

I have added a workaround in this pull request that replaces icmp eq .., false with xor .., true when lowering a "not" operator.
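
The rewrite is behavior-preserving because, for a one-bit value, comparing against false and XOR-ing with true are both plain logical negation. A minimal C++ sketch of that truth table follows; `not_via_icmp` and `not_via_xor` are hypothetical names used only for illustration and are not LDC or LLVM functions.

```cpp
#include <cassert>

// Models `icmp eq i1 %x, false`: true exactly when x is false.
// Hypothetical helper, not part of LDC or LLVM.
constexpr bool not_via_icmp(bool x) { return x == false; }

// Models `xor i1 %x, true`: flips the single bit.
// Hypothetical helper, not part of LDC or LLVM.
constexpr bool not_via_xor(bool x) { return static_cast<bool>(x ^ true); }

// An i1 has only two possible values, so these checks are exhaustive.
static_assert(not_via_icmp(false) == not_via_xor(false), "forms agree on false");
static_assert(not_via_icmp(true) == not_via_xor(true), "forms agree on true");
```

Since both forms compute the same result, the workaround only changes which IR pattern LLVM's PowerPC instruction selection sees, not what the program does.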

@JohanEngelen
Member

During the porting process, I discovered what seems to be an LLVM bug when targeting ppc64 using the IEEE long double ABI.

Please also report the bug in LLVM's bug tracker.

Nice that you are working on this btw !

@@ -570,6 +570,8 @@ version (LDC)

version (AArch64) version = CheckFiberMigration;

version (PPC64) version = CheckFiberMigration;
Member

intended change?

Contributor Author

Yes, this is an intended change. I fixed this after running the basic unit tests.

Contributor Author

To add to my last comment, that issue was not discovered because the D Phobos library needs IEEE 128-bit arithmetic to work, which was not implemented before this pull request.

@JohanEngelen
Member

would be good to add a lit testcase for setting this abi param

@liushuyu
Contributor Author

liushuyu commented Feb 2, 2025

would be good to add a lit testcase for setting this abi param

Will do

@liushuyu
Contributor Author

liushuyu commented Feb 4, 2025

During the porting process, I discovered what seems to be an LLVM bug when targeting ppc64 using the IEEE long double ABI.

Please also report the bug in LLVM's bug tracker.

Nice that you are working on this btw !

Thanks! I have proposed a fix to the LLVM upstream regarding this issue: llvm/llvm-project#125776

@liushuyu
Contributor Author

liushuyu commented Feb 6, 2025

Corresponding DMD pull request: dlang/dmd#20826

@liushuyu liushuyu force-pushed the ppc64-d-ieee754-fix-new branch from 2361ed6 to a644c57 Compare February 14, 2025 22:35
@liushuyu liushuyu marked this pull request as ready for review February 15, 2025 21:08
@liushuyu
Contributor Author

I think this is ready for review. I found some additional issues with druntime and phobos during testing, but the compiler works very well.

#include "gen/irstate.h"
#include "gen/llvmhelpers.h"
#include "gen/tollvm.h"

using namespace dmd;

struct LongDoubleRewrite : ABIRewrite {
Member

I don't think we need this rewrite. The D real values (Tfloat80 etc.) are already IR-emitted as doubledouble or IEEE quad by the changes in target.cpp. The compiler represents all compile-time floating-point values via real_t, which is the D host compiler's real on non-x86. Compile-time real_t and the target real diverging can easily happen during cross-compilation (e.g., x87 real_t for x86 compilers cross-compiling to quad real for Linux AArch64). There's a corresponding APFloat conversion happening for IR emission already. The resulting limitations, incl. dangers of compile-time over/underflows when cross-compiling to a target with greater real precision, are mentioned in https://wiki.dlang.org/Cross-compiling_with_LDC#Limitations.

@@ -4284,404 +4284,651 @@ else
double acos(double x);
///
float acosf(float x);
///
real acosl(real x);
Member

I doubt this will be accepted upstream, this break-up of the default block, and so think it'd be better to add a new top-level block here at line 4281 with static if (PPCUseIEEE128). All the BSDs, MSVC, Bionic, uclibc etc. have their special-cases kept away in their separate blocks - more duplication, but less cluttering with special cases all over the place.

Contributor Author

I doubt this will be accepted upstream, this break-up of the default block, and so think it'd be better to add a new top-level block here at line 4281 with static if (PPCUseIEEE128). All the BSDs, MSVC, Bionic, uclibc etc. have their special-cases kept away in their separate blocks - more duplication, but less cluttering with special cases all over the place.

This is actually the version accepted upstream, you can see it here: dlang/dmd@9ae8b3e

version (D_PPCUseIEEE128)
enum has128BitCAS = true;
else
enum has128BitCAS = false;
Member

enum has128BitCAS = real.mant_dig == 113, then we don't need the extra predefined version anymore.

Please note that there's one catch with static if vs. a version(…) - the former is semantically analyzed later, versions are resolved really early. If unlucky, forward referencing errors can occur. But if druntime and the test runners can still be built successfully, we might get away without extra predefined version.
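
The mantissa-width test suggested here can be illustrated on the C++ side, where `std::numeric_limits<long double>::digits` plays the role of D's `real.mant_dig`. The following is a rough sketch only: the `RealFormat` enum and `classify_mant_dig` are hypothetical names, and the widths 53, 64, 106 and 113 are the usual values for IEEE double, x87 extended, IBM double-double and IEEE quad respectively.

```cpp
#include <cassert>
#include <limits>

// Hypothetical classification of a floating-point format by its
// mantissa width in bits (what D exposes as `real.mant_dig`).
enum class RealFormat { Double, X87Extended, IbmDoubleDouble, IeeeQuad, Unknown };

constexpr RealFormat classify_mant_dig(int mant_dig) {
    return mant_dig == 53  ? RealFormat::Double
         : mant_dig == 64  ? RealFormat::X87Extended
         : mant_dig == 106 ? RealFormat::IbmDoubleDouble
         : mant_dig == 113 ? RealFormat::IeeeQuad
                           : RealFormat::Unknown;
}

// The host C++ compiler's `long double`, classified the same way.
constexpr RealFormat host_long_double =
    classify_mant_dig(std::numeric_limits<long double>::digits);

static_assert(classify_mant_dig(113) == RealFormat::IeeeQuad,
              "113 mantissa bits identify IEEE quad");
```

Because the check is an ordinary constant expression rather than a predefined version identifier, it is evaluated later than a `version` condition would be in D, which is the forward-referencing caveat raised in the comment.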

gen/target.cpp Outdated
@@ -240,6 +264,19 @@ const char *TargetCPP::typeMangle(Type *t) {
// `long double` on Android/x64 is __float128 and mangled as `g`
bool isAndroidX64 = triple.getEnvironment() == llvm::Triple::Android &&
triple.getArch() == llvm::Triple::x86_64;
if (triple.getArch() == llvm::Triple::ppc64 ||
triple.getArch() == llvm::Triple::ppc64le) {
if (global.params.ppcUseIEEE128 &&
Member

target.RealProperties.mant_dig == 113 should work as well

gen/target.cpp Outdated
triple.getEnvironment() == llvm::Triple::GNU) {
return "u9__ieee128";
}
if (size(t) == 16) {
Member

target.realsize, or another property
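
To show how the type code returned by `typeMangle` ends up in a symbol name, here is a small sketch that assembles an Itanium-style mangled name for a free function with one parameter. `mangle_unary` is a hypothetical helper written for this illustration, not LDC code; the type codes used are `e` (the Itanium ABI code for `long double`), `g` (`__float128`, as in the Android/x64 comment in the hunk) and `u9__ieee128` (the vendor-extended code from the hunk).

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: Itanium-style mangled name of a free function
// `name` taking a single parameter whose type code is `param_code`.
// Layout: "_Z" + <length of name> + <name> + <parameter type code>.
std::string mangle_unary(const std::string &name, const std::string &param_code) {
    return "_Z" + std::to_string(name.size()) + name + param_code;
}

// "u9__ieee128" is itself length-prefixed: 'u' marks a vendor-extended
// type and "9" is the length of the identifier "__ieee128".
```

With these pieces, `mangle_unary("test1", "u9__ieee128")` yields `_Z5test1u9__ieee128`, which is the symbol name that appears in the reduced LLVM IR earlier in this thread.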

// special handling for ieeelongdouble options
// note that, it is expected if we did not see any previous mabi
// options, we reset mABI variable to empty
if (Arg == "ieeelongdouble") {
Member

@kinke kinke Feb 15, 2025

I understand this would be clang-compatible, but still, we don't have long double in D, that's real.

I guess another option could be something like -real-precision=<double|quad>, for all targets, a cmdline option that we could embed in gen/target.cpp, as we only need it to choose/override the target real type (and then wouldn't need a global.params.ppcUseIEEE128). Only supporting the two IEEE variants, no x87 and doubledouble exotics. -real-precision=double could e.g. also come in handy for webassembly in #4838, as a way to deal with existing real code (people unfortunately use it in D much more often than long double in C...).

Contributor Author

I understand this would be clang-compatible, but still, we don't have long double in D, that's real.

I guess another option could be something like -real-precision=<double|quad>, for all targets, a cmdline option that we could embed in gen/target.cpp, as we only need it to choose the target real type (and then wouldn't need a global.params.ppcUseIEEE128). Only supporting the two IEEE variants, no x87 and doubledouble exotics. -real-precision=double could e.g. also come in handy for webassembly in #4838, as a way to deal with existing real code (people unfortunately use it in D much more often than long double in C...).

Then how do we express -mabi=ibmlongdouble in the case of LDC? I know the idea might be to use IEEE quad as much as possible, but I think at least in the case of POWER platforms, a compatibility escape hatch is still needed (a lot of Linux distros still default to IBM double double for their system C libraries).

Member

IIUC, doubledouble is currently the default setting when targeting glibc, so one wouldn't have to override the default with -real-precision. But yeah, I guess it would make sense to default to IEEE quad when adding support for it now in 2025, or at least have the option to default to it at one point in the future. So yeah, -real-precision might have to include x87 and doubledouble too.

Contributor Author

IIUC, doubledouble is currently the default setting when targeting glibc, so one wouldn't have to override the default with -real-precision. But yeah, I guess it would make sense to default to IEEE quad when adding support for it now in 2025, or at least have the option to default to it at one point in the future. So yeah, -real-precision might have to include x87 and doubledouble too.

I have added a new option -real-precision=<double|quad|platform> for this (platform = platform-specific encoding, on x86 this would be x87 and on ppc64le this would be IBM double-double)
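
As a rough illustration of the three accepted values, the sketch below maps the option's argument to an enum and spells out what `platform` resolves to on the two architectures named in this comment. The names `RealPrecision`, `parse_real_precision` and `platform_format` are hypothetical and do not mirror LDC's actual option handling.

```cpp
#include <cassert>
#include <string>

// Hypothetical model of the values accepted by -real-precision=<...>.
enum class RealPrecision { Double, Quad, Platform, Invalid };

// Hypothetical parser; LDC's real option handling may look different.
RealPrecision parse_real_precision(const std::string &value) {
    if (value == "double") return RealPrecision::Double;
    if (value == "quad") return RealPrecision::Quad;
    if (value == "platform") return RealPrecision::Platform;
    return RealPrecision::Invalid;
}

// What "platform" means per the comment above. Only the two cases
// mentioned in this thread are filled in; anything else is left open.
const char *platform_format(const std::string &arch) {
    if (arch == "x86") return "x87 extended";
    if (arch == "ppc64le") return "IBM double-double";
    return "not specified in this thread";
}
```

Keeping `platform` as an explicit value is what preserves the compatibility escape hatch asked for earlier: it plays the role of -mabi=ibmlongdouble on POWER without naming a C type in a D option.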

elseif ( NOT HAS_IBM_LONG_DOUBLE )
# usually the case for musl/uclibc
append("-mlong-double-64" LDC_CXXFLAGS)
endif()
Member

Wait, you check the default behavior of the C++ compiler, so we shouldn't need any explicit C++ flags. [Unless you wanna override the LLVM flags.] The D host compiler might need a flag though, to make sure its real matches the C++ long double, at least for the IEEE-quad case, which needs an explicit opt-in with (new) LDC compilers.

Contributor Author

Hmm, I admit that might be a mistake. I will fix this.

Member

Actually it's a bit more complicated than that - host druntime and Phobos need to be (pre)compiled with the same ABI setting, matching the C++/LLVM one.

Contributor Author

Actually it's a bit more complicated than that - host druntime and Phobos need to be (pre)compiled with the same ABI setting, matching the C++/LLVM one.

This one might need to be documented. According to my testing, changing the host druntime files in the include directory is enough to get the new LDC bootstrapped (using GDC/GDMD).

Member

Yeah. But as adding some D flag wouldn't be sufficient then anyway, we probably don't need to try to add some here in CMake anymore. Host C++ and D compilers need to target the same ABI when building LDC; if the D one needs tweaking via an extra flag, host druntime and Phobos need to be built accordingly and selected too, as in a cross-compile scenario.

[You were probably lucky that a few binding adaptations in the source files were enough to fix up host druntime, without having to rebuild the library.]

Contributor Author

Done. CMake files fixed

Member

You now affect the new druntime and Phobos builds for the just-built LDC, compiling them with the same ABI setting as the compiler itself. It doesn't affect (and cannot) host druntime and Phobos (from GDC in your case), which need to be precompiled with the same ABI setting (as they are linked into the LDC executable).

Now suppose LDC was built on a PPC system with (default) IEEE quad ABI. New druntime and Phobos would be compiled with -real-precision=quad. But when the user runs it with ldc2 hello.d, he'd get a hello object file compiled with default doubledouble ABI, linked with druntime and Phobos using the IEEE ABI. As the D real mangling isn't affected by its precision, the user wouldn't get undefined symbols, but most likely just corrupt floating-point values at runtime.

So a distro package maintainer would in that case need to make sure the compiler defaults to the same ABI as the bundled precompiled druntime and Phobos. E.g., by adding a PPC section in ldc2.conf, adding -real-precision=quad as default switch.

So as said, I think it'd be best to just remove all of this CMake flag fiddling for PPC - it's not enough, the user/package maintainer has to get involved and provide explicit flags in case the default host compiler ABIs diverge, and/or a non-default ABI is desired. As long as GDC is used as the host compiler, the compilers most likely default to the same ABI. [And there's probably a long way to go until LDC can build itself on such platforms, that requires full C ABI compatibility (ABI rewrites to help LLVM do the right thing) and full C-style variadic arguments support.]
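
To make the "corrupt floating-point values" risk concrete, the sketch below compares how the value 1.0 is encoded in the two formats. It is an illustration only and assumes the host `double` is IEEE binary64; the helper names are hypothetical, and it deliberately says nothing about the in-memory order of the two halves on any particular target.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Raw bits of a double (assumes IEEE binary64, as on common hosts).
std::uint64_t double_bits(double d) {
    static_assert(sizeof(std::uint64_t) == sizeof(double), "double must be 64 bits");
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);
    return bits;
}

// Inverse of double_bits.
double bits_to_double(std::uint64_t bits) {
    double d;
    std::memcpy(&d, &bits, sizeof d);
    return d;
}

// Most significant 64 bits of 1.0 in IEEE binary128:
// sign 0, 15-bit exponent equal to the bias 16383 (0x3FFF), fraction 0.
constexpr std::uint64_t kQuadOneHigh = 0x3FFF000000000000ULL;

// In IBM double-double, 1.0 is the pair (1.0, 0.0); the leading double
// uses binary64's 11-bit exponent with bias 1023 (0x3FF).
std::uint64_t double_double_one_head() { return double_bits(1.0); }
```

The leading word is 0x3FF0000000000000 for double-double but 0x3FFF000000000000 for quad. If the quad word were read as the leading double of a double-double, it would decode as 1.9375 rather than 1.0, so object code and libraries built with different settings link fine but silently disagree about values.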

Member

@kinke kinke Feb 16, 2025

What we could also do: for a native ppc compiler build, default to the real precision of the host. So a native IEEE-quad build would default to -real-precision=quad (incl. druntime and Phobos automatically precompiled with that ABI setting).

Contributor Author

Okay, I have now put these inside ldc2*.conf files.

@liushuyu liushuyu force-pushed the ppc64-d-ieee754-fix-new branch from a644c57 to 739cd9b Compare February 16, 2025 02:15
@liushuyu
Contributor Author

FreeBSD CI is broken because Google Cloud no longer has the 13.3 image (the oldest supported image is FreeBSD 13.4).

@liushuyu liushuyu force-pushed the ppc64-d-ieee754-fix-new branch 2 times, most recently from b485bdc to a413d1f Compare February 16, 2025 20:17
... the system is using the new ABI that supports IEEE 754R long double
instead of the legacy IBM double double format
@liushuyu liushuyu force-pushed the ppc64-d-ieee754-fix-new branch from a413d1f to 96c455c Compare February 16, 2025 20:36
@liushuyu
Contributor Author

Also GitHub says: actions/runner-images#11101

{
// default to IEEE quad precision
// if your platform does not support this, feel free to remove it.
switches = ["--real-precision=quad"]
Contributor

Arrays are not cumulative in ldc2.conf, so what you did removed -defaultlib=druntime-ldc from the default switches for ppc64le.
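
The replace-not-append behavior described here can be modelled in a few lines. This is a hypothetical model of the semantics stated in the comment, not LDC's config parser; `Switches`, `effective_switches` and `with_defaults_repeated` are names invented for the sketch.

```cpp
#include <cassert>
#include <string>
#include <vector>

using Switches = std::vector<std::string>;

// Hypothetical model: a target-specific section that sets `switches`
// replaces the default list instead of appending to it. A null
// `section` means the section does not set `switches` at all.
Switches effective_switches(const Switches &defaults, const Switches *section) {
    return section ? *section : defaults;
}

// To keep a default such as -defaultlib=druntime-ldc, the section has
// to repeat the defaults itself before adding its own entries.
Switches with_defaults_repeated(const Switches &defaults, const Switches &extra) {
    Switches out = defaults;
    out.insert(out.end(), extra.begin(), extra.end());
    return out;
}
```

Under this model, a ppc64le section containing only `--real-precision=quad` ends up with that single switch, which is exactly how the default library switch got lost.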

@kinke
Member

kinke commented Feb 18, 2025

This is how I imagine it, without being able to test anything: #4840

@kinke kinke merged commit 96c455c into ldc-developers:master Feb 20, 2025
19 of 20 checks passed
5 participants