Additional float types #3451

Closed
139 changes: 139 additions & 0 deletions text/3451-additional-float-types.md
@@ -0,0 +1,139 @@
- Feature Name: `additional-float-types`
- Start Date: 2023-6-28
- RFC PR: [rust-lang/rfcs#3451](https://github.com/rust-lang/rfcs/pull/3451)
- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000)

# Summary
[summary]: #summary

This RFC proposes new floating point types `f16` and `f128` into core language and standard library. Also this RFC introduces `f80`, `doubledouble`, `bf16` into `core::arch` for inter-op with existing native code.
Contributor

Suggested change
This RFC proposes new floating point types `f16` and `f128` into core language and standard library. Also this RFC introduces `f80`, `doubledouble`, `bf16` into `core::arch` for inter-op with existing native code.
This RFC proposes new floating point types `f16` and `f128` into core language and standard
library. Also, this RFC introduces `f80`, `doubledouble`, and `bf16` into `core::arch` for
target-specific support, and `core::ffi::c_longdouble` for FFI interop.

@lygstate lygstate Jun 29, 2023

I suggest using these names, which all start with an `f` prefix for consistency:

  • f128: C _Float128, LLVM fp128, GCC __float128
  • f16: C _Float16, LLVM half, GCC __fp16
  • f16b: C++ std::bfloat16_t, LLVM bfloat, GCC __bf16
  • f80e: LLVM x86_fp80, GCC __float80
  • f64f64: LLVM ppc_fp128, GCC __ibm128

These names can also be used as literal suffixes as-is.

Contributor

+1 for f16b in favor over bf16 for consistency, I liked that about @joshtriplett's original proposal.

I don't think we should introduce something like f128x; doubledouble or something like the GCC or LLVM types would be better IMO. The reason is that it's kind of ambiguous and specific to one architecture, and PowerPC is even moving away from it. Better to give it an unambiguous name since it will be used relatively rarely.

Member

i think we should use bf16 rather than f16b since that is widely recognized whereas f16b isn't, and f64_f64 instead of f128x since it really is 2 f64 values and could be easily emulated on any other architecture (do not use f64x2 since that's already used by Simd<f64, 2>). also f<N>x names are more or less defined by IEEE 754 to be wider than N bits, so e.g. f64x would be approximately any type wider than f64 but less than f128 such as f80, f16x could be the f24 type used by some GPUs for depth buffers. so logically f80x would need to be more than 80 bits and f128x would need to be more than 128 bits.

@lygstate lygstate Jun 29, 2023

> +1 for f16b in favor over bf16 for consistency, I liked that about @joshtriplett's original proposal.
>
> I don't think we should introduce something like f128x - doubledouble or something like the GCC or LLVM types would be better IMO. Reason being, it's kind of ambiguous and specific to one architecture - PowerPC is even moving away from it. Better to give it an unambiguous name since it will be used relatively rarely.

f64f64 is OK; there is no need for the underscore, and it looks like doubledouble. Use f80e instead of f80x if f80x is not acceptable.

I still vote for f16b. It's Rust-specific, and we can document the relationship between bf16 and f16b in Rust; that won't be a burden.

Contributor

@aaronfranke aaronfranke Jun 29, 2023

What about f64x2 to indicate it's two f64 glued together?

@lygstate lygstate Jun 29, 2023

> do not use f64x2 since that's already used by Simd<f64, 2>


# Motivation
[motivation]: #motivation

The IEEE 754 standard defines binary floating-point formats, including binary16, binary32, binary64, and binary128. binary32 and binary64 correspond to the `f32` and `f64` types in Rust, while binary16 and binary128 are used in multiple scenarios (machine learning, scientific computing, etc.) and are supported by some modern architectures, either in hardware or in software.

In the C/C++ world, there are already types representing these formats, along with legacy non-standard types specific to some platforms. Introducing them in a limited way would help improve FFI with such code.

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

`f16` and `f128` are primitive floating-point types and can be used just like `f32` or `f64`. They always conform to the binary16 and binary128 formats defined in IEEE 754, which means the size of `f16` is always 16 bits and the size of `f128` is always 128 bits.

```rust
let val1 = 1.0; // Default type is still f64
let val2: f128 = 1.0;
let val3: f16 = 1.0;
let val4 = 1.0f128; // Suffix of f128 literal
let val5 = 1.0f16; // Suffix of f16 literal

println!("Size of f128 in bytes: {}", std::mem::size_of_val(&val2)); // 16
println!("Size of f16 in bytes: {}", std::mem::size_of_val(&val3)); // 2
```

Because not every target supports `f16` and `f128`, the compiler provides conditional guards for them:

```rust
#[cfg(target_has_f128)]
fn get_f128() -> f128 { 1.0f128 }

#[cfg(target_has_f16)]
fn get_f16() -> f16 { 1.0f16 }
```
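
As a rough sketch of how downstream code might use such a guard, the snippet below picks a wider accumulator type when the target has `f128` and falls back to `f64` otherwise. The alias `Widest` and the function `sum_wide` are hypothetical names chosen for illustration, and `target_has_f128` is only the cfg name proposed in this RFC: toolchains that do not define it take the fallback branch and may warn about an unknown cfg name.

```rust
/// Widest float type this sketch can rely on. `target_has_f128` is the cfg
/// name proposed by the RFC; where it is not set, the `f64` fallback is used.
#[cfg(target_has_f128)]
pub type Widest = f128;
#[cfg(not(target_has_f128))]
pub type Widest = f64;

/// Sums the inputs in the widest available type, then narrows back to `f64`.
pub fn sum_wide(values: &[f64]) -> f64 {
    let mut acc: Widest = 0.0;
    for &v in values {
        // `From<f64> for f128` is one of the conversions listed in the RFC;
        // with the fallback alias this is the identity conversion.
        acc += Widest::from(v);
    }
    acc as f64
}
```

Callers only see `f64` in the signature, so the same code compiles on every target and merely gains precision in the intermediate sum where `f128` exists.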

All operators, constants, and math functions defined for `f32` and `f64` in `core` are also defined for `f16` and `f128`, guarded by the respective conditional guards.

`f80` type is defined in `core::arch::{x86, x86_64}`. `doubledouble` type is defined in `core::arch::{powerpc, powerpc64}`. `bf16` type is defined in `core::arch::{arm, aarch64, x86, x86_64}`. They do not have literal representation.
Contributor

Suggested change
`f80` type is defined in `core::arch::{x86, x86_64}`. `doubledouble` type is defined in `core::arch::{powerpc, powerpc64}`. `bf16` type is defined in `core::arch::{arm, aarch64, x86, x86_64}`. They do not have literal representation.
The `f80` type is defined in `core::arch::{x86, x86_64}` as 80-bit extended precision. The `doubledouble`
type is defined in `core::arch::{powerpc, powerpc64}` and represents PowerPC's non-IEEE double-double
format (two `f64`s used to approximate `f128`). The `bf16` type is defined in `core::arch::{arm, aarch64, x86, x86_64}` and represents the "brain" float, a truncated `f32` with SIMD support on some hardware. These
types do not have literal representation.
When working with FFI, the `core::ffi::c_longdouble` type can be used to match whatever type
`long double` represents in C.

Contributor Author

Thanks, I did not add the mention of longdouble yet. More things need to be clarified:

  • Is there always only 1 long double for each (arch, abi, os) tuple? For example, powerpc64le-unknown-linux-gnu can use either double or doubledouble or IEEE binary128 as long double by -mabi=(ieee|ibm)longdouble and -mlong-double-(64|128).
  • Is mangling of long double the same regardless of its underlying semantics?
  • Some targets (also powerpc64le for example) support .gnu_attribute, so that linker can differentiate objects compiled by different long double ABI. Should Rust programs using c_longdouble emit such attribute?

Member

bf16 is supported on a wide range of newer architectures, such as powerpc, x86, arm, and (WIP) risc-v. imho it should not be classified as architecture-specific but instead more like f16

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, bf16 should be emulated in software when the target arch does not support it natively.

Member

@RalfJung RalfJung Jun 30, 2023

For people that have never heard of bf16 or doubledouble (which I assume are 16 and 128 bits in size, respectively), it would be good to link to some sort of document explaining them, and how they differ from f16 and f128, respectively.

Also the RFC needs to say what their semantics are, if IEEE doesn't specify them.

Member

nobody seems to agree on bf16 semantics:
arm has both round as normal with subnormals supported and round to odd with subnormals not supported.
x86 has round to nearest with subnormals not supported.
powerpc has round as normal with subnormals supported.
all isas have round towards zero with subnormals supported (just f32::to_bits(v) >> 16).
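
To make the difference between two of these rounding behaviours concrete, here is a minimal sketch that works on raw bit patterns in stable Rust. The function names are hypothetical, and the round-to-nearest-even variant is a common software technique, not a statement about what any particular ISA does with subnormals or NaN payloads.

```rust
/// Round toward zero: keep the top 16 bits of the `f32` encoding.
fn f32_to_bf16_truncate(v: f32) -> u16 {
    (v.to_bits() >> 16) as u16
}

/// Round to nearest, ties to even, on the 16 discarded bits.
fn f32_to_bf16_round_nearest_even(v: f32) -> u16 {
    let bits = v.to_bits();
    if v.is_nan() {
        // Keep the result a NaN by forcing the quiet bit (assumption: the
        // payload does not need to be preserved).
        return ((bits >> 16) as u16) | 0x0040;
    }
    let lsb = (bits >> 16) & 1; // lowest bit that survives the narrowing
    ((bits + 0x7FFF + lsb) >> 16) as u16
}

/// Widening back to `f32` is exact: the low 16 bits are zero.
fn bf16_bits_to_f32(b: u16) -> f32 {
    f32::from_bits((b as u32) << 16)
}
```

For an input such as `f32::from_bits(0x3F81_8000)`, which lies exactly halfway between two bf16 values, truncation yields `0x3F81` while round-to-nearest-even yields `0x3F82`.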


# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

## `f16` type

`f16` consists of 1 bit of sign, 5 bits of exponent, 10 bits of mantissa.

The following `From` and `TryFrom` traits are implemented for conversion between `f16` and other types:

```rust
impl From<f16> for f32 { /* ... */ }
impl From<f16> for f64 { /* ... */ }
impl From<bool> for f16 { /* ... */ }
impl From<u8> for f16 { /* ... */ }
impl From<i8> for f16 { /* ... */ }
```

`f16` will generate `half` type in LLVM IR.
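
The binary16 layout described in this section (1 sign bit, 5 exponent bits, 10 mantissa bits) can be illustrated by decoding a raw `u16` into an `f32` on stable Rust. This is only a sketch: `f16_bits_to_f32` and `pow2` are hypothetical helper names, and NaN payloads are not preserved. Every binary16 value is exactly representable in `f32`, which is why the widening `From<f16> for f32` conversion listed above is lossless.

```rust
/// Exact power of two as an `f32`, valid for normal exponents (-126..=127).
fn pow2(e: i32) -> f32 {
    f32::from_bits(((e + 127) as u32) << 23)
}

/// Decodes an IEEE 754 binary16 bit pattern into the `f32` with the same value.
fn f16_bits_to_f32(bits: u16) -> f32 {
    let sign = if bits >> 15 == 1 { -1.0f32 } else { 1.0 };
    let exponent = ((bits >> 10) & 0x1F) as i32; // 5 bits, bias 15
    let mantissa = (bits & 0x3FF) as f32; // 10 explicit bits

    match exponent {
        // Zero and subnormals: mantissa * 2^-10 * 2^-14, no implicit leading 1.
        0 => sign * mantissa * pow2(-24),
        // All-ones exponent: infinity or NaN.
        0x1F => {
            if mantissa == 0.0 {
                sign * f32::INFINITY
            } else {
                f32::NAN
            }
        }
        // Normal numbers: implicit leading 1, unbiased exponent.
        _ => sign * (1.0 + mantissa / 1024.0) * pow2(exponent - 15),
    }
}
```

For example, `0x3C00` decodes to `1.0` and `0x7BFF` decodes to `65504.0`, the largest finite binary16 value.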

## `f128` type

`f128` consists of 1 bit of sign, 15 bits of exponent, 112 bits of mantissa.

`f128` is available on targets that have (1) hardware instructions or software emulation for a 128-bit float type; (2) backend support for the `f128` type on the target; and (3) the essential target features enabled (if any).

The list of targets supporting `f128` type may change over time. Initially, it includes `powerpc64le-*`.
Contributor

@tgross35 tgross35 Jun 28, 2023

Suggested change
The list of targets supporting `f128` type may change over time. Initially, it includes `powerpc64le-*`.
The list of targets supporting `f128` type may change over time. Initially, it includes `powerpc64le-*`,
`x86_64-*` and `aarch64-*`.

I don't know for sure what targets support it, but we should aim to at least support the major 64-bit CPUs
here at first

There is also a RISC-V target per @aaronfranke here #2629 (comment), but I'm not sure how rv64gQc maps to our riscv64gc.


The following traits are also implemented for conversion between `f128` and other types:

```rust
impl From<f16> for f128 { /* ... */ }
impl From<f32> for f128 { /* ... */ }
impl From<f64> for f128 { /* ... */ }
impl From<bool> for f128 { /* ... */ }
impl From<u8> for f128 { /* ... */ }
impl From<i8> for f128 { /* ... */ }
impl From<u16> for f128 { /* ... */ }
impl From<i16> for f128 { /* ... */ }
impl From<u32> for f128 { /* ... */ }
impl From<i32> for f128 { /* ... */ }
impl From<u64> for f128 { /* ... */ }
impl From<i64> for f128 { /* ... */ }
```

`f128` will generate `fp128` type in LLVM IR.
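
The sets of integer `From` impls listed for `f16` and `f128` follow from significand precision: binary16 carries 11 significant bits (10 stored plus the implicit one) and binary128 carries 113. A small sketch of that reasoning follows; `FloatFormat` and `int_converts_losslessly` are hypothetical names, and the exponent check is deliberately conservative.

```rust
/// Minimal description of a binary floating-point format.
struct FloatFormat {
    /// Significand precision in bits, including the implicit leading bit.
    precision_bits: u32,
    /// Largest unbiased exponent of a finite value.
    max_exponent: u32,
}

const F16: FloatFormat = FloatFormat { precision_bits: 11, max_exponent: 15 };
const F128: FloatFormat = FloatFormat { precision_bits: 113, max_exponent: 16383 };

/// Returns true if every value of an integer type with `int_bits` bits is
/// exactly representable in `fmt`.
fn int_converts_losslessly(fmt: &FloatFormat, int_bits: u32, signed: bool) -> bool {
    // A signed type spends one bit on the sign, leaving `int_bits - 1` value bits.
    let value_bits = if signed { int_bits - 1 } else { int_bits };
    value_bits <= fmt.precision_bits && value_bits <= fmt.max_exponent
}
```

Under these assumptions only `u8` and `i8` fit into `f16`, while everything up to `u64` and `i64` fits into `f128`, which matches the impl lists in the RFC text.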


`std::simd` defines new vector types with `f16` or `f128` elements: `f16x2`, `f16x4`, `f16x8`, `f16x16`, `f16x32`, `f128x2`, and `f128x4`.

For `doubledouble` type, conversion intrinsics are available under `core::arch::{powerpc, powerpc64}`. For `f80` type, conversion intrinsics are available under `core::arch::{x86, x86_64}`.

## Architecture-specific types

As for non-standard types, `f80` generates `x86_fp80`, `doubledouble` generates `ppc_fp128`, `bf16` generates `bfloat`.
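
For readers unfamiliar with the double-double idea behind `ppc_fp128`, the following sketch models a value as the unevaluated sum of two `f64`s and adds two such values using Knuth's error-free `two_sum`. It illustrates the general technique only: the type `DoubleDouble` and the functions here are hypothetical, and this simplified addition is not claimed to match PowerPC's exact semantics.

```rust
/// A value represented as `hi + lo`, where `lo` holds the rounding error
/// that does not fit into `hi`.
#[derive(Clone, Copy, Debug, PartialEq)]
struct DoubleDouble {
    hi: f64,
    lo: f64,
}

/// Knuth's TwoSum: returns `(s, e)` with `s = fl(a + b)` and `a + b = s + e` exactly.
fn two_sum(a: f64, b: f64) -> (f64, f64) {
    let s = a + b;
    let b_virtual = s - a;
    let e = (a - (s - b_virtual)) + (b - b_virtual);
    (s, e)
}

/// Simplified double-double addition: add the high parts exactly, fold the
/// low parts into the error term, then renormalize.
fn dd_add(x: DoubleDouble, y: DoubleDouble) -> DoubleDouble {
    let (sum, err) = two_sum(x.hi, y.hi);
    let (hi, lo) = two_sum(sum, err + x.lo + y.lo);
    DoubleDouble { hi, lo }
}
```

In plain `f64`, `1.0 + 1e-30` rounds back to `1.0`; with the pair representation the small term survives in `lo`, so subtracting `1.0` again recovers `1e-30`.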

# Drawbacks
[drawbacks]: #drawbacks

Unlike `f32` and `f64`, not every target supports these two types natively with regard to the ABI, even though platform-independent implementations of the supplementary intrinsics exist. Handling the different cases will make adding them a challenge.
Comment on lines +117 to +120
Contributor

Not every platform supports f32 and f64 natively either. For example, RISC-V without the F or D extensions (ex: ISA string of rv64i). This should be mentioned.

Whatever emulation Rust already does to support f32 and f64 on systems without native support should similarly happen to emulate f128 on systems without native quadruple-precision support.

Member

For riscv without hardware float support there is a defined soft-float ABI. There is not for f16/bf16. Same for x86_64. Many other architectures likely don't have a defined soft-float abi for f128 either. And as I understand it AArch64 doesn't have a soft-float abi at all as Neon support is mandatory and floats are even allowed inside the kernel unlike eg x86_64 where floats are disabled in the kernel to avoid having to save them on syscalls.


# Rationale and alternatives
[rationale-and-alternatives]: #rationale-and-alternatives

There are some crates aiming for similar functionality:

- [f128](https://github.com/jkarns275/f128) provides binding to `__float128` type in GCC.
- [half](https://github.com/starkat99/half-rs) provides implementation of binary16 and bfloat16 types.

However, besides the inconsistency in usage between a primitive type and a type from a crate, there are still issues with those bindings.

The availability of additional float types depends heavily on the CPU, OS, ABI, and enabled features of each target. Evolution of LLVM may also unlock these types on new targets. Implementing them in the compiler handles this at the most suitable place.

Most such crates define their types on top of C bindings. But extended float type definitions in C are complex and confusing. The meaning of `long double` and `_Float128` varies by target and compiler options. Implementing them in the Rust compiler helps maintain a stable codegen interface.

And since third-party tools also rely on Rust internal code, implementing additional float types in the compiler also helps those tools recognize them.

# Prior art
[prior-art]: #prior-art

We have a previous proposal on `f16b` type to represent `bfloat16`: https://github.com/joshtriplett/rfcs/blob/f16b/text/0000-f16b.md

# Unresolved questions
[unresolved-questions]: #unresolved-questions

This proposal does not introduce a `c_longdouble` type for FFI, because it can mean one of `f128`, `doubledouble`, `f64`, or `f80` depending on the target. The same applies to `c_float128`.

# Future possibilities
[future-possibilities]: #future-possibilities

More functions may be added for the platform-dependent float types, such as casts between `f128` and `doubledouble`.

For targets not supporting `f16` or `f128`, we may be able to introduce a 'limited mode', where the types are not fully functional but users can load and store them and call functions with such arguments.