Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FP8 ACLE specification #323

Merged
merged 2 commits into from
Sep 23, 2024
Merged

Conversation

momchil-velikov
Copy link
Contributor


name: Pull request
about: Technical issues, document format problems, bugs in scripts or feature proposal.


Thank you for submitting a pull request!

If this PR is about a bugfix:

Please use the bugfix label and make sure to go through the checklist below.

If this PR is about a proposal:

We are looking forward to evaluate your proposal, and if possible to
make it part of the Arm C Language Extension (ACLE) specifications.

We would like to encourage you reading through the contribution
guidelines
, in particular the section on submitting
a proposal
.

Please use the proposal label.

As for any pull request, please make sure to go through the below
checklist.

Checklist: (mark with X those which apply)

  • If an issue reporting the bug exists, I have mentioned it in the
    PR (do not bother creating the issue if all you want to do is
    fixing the bug yourself).
  • I have added/updated the SPDX-FileCopyrightText lines on top
    of any file I have edited. Format is SPDX-FileCopyrightText: Copyright {year} {entity or name} <{contact informations}>
    (Please update existing copyright lines if applicable. You can
    specify year ranges with hyphen , as in 2017-2019, and use
    commas to separate gaps, as in 2018-2020, 2022).
  • I have updated the Copyright section of the sources of the
    specification I have edited (this will show up in the text
    rendered in the PDF and other output format supported). The
    format is the same described in the previous item.
  • I have run the CI scripts (if applicable, as they might be
    tricky to set up on non-*nix machines). The sequence can be
    found in the contribution
    guidelines
    . Don't
    worry if you cannot run these scripts on your machine, your
    patch will be automatically checked in the Actions of the pull
    request.
  • I have added an item that describes the changes I have
    introduced in this PR in the section Changes for next
    release
    of the section Change Control/Document history
    of the document. Create Changes for next release if it does
    not exist. Notice that changes that are not modifying the
    content and rendering of the specifications (both HTML and PDF)
    do not need to be listed.
  • When modifying content and/or its rendering, I have checked the
    correctness of the result in the PDF output (please refer to the
    instructions on how to build the PDFs
    locally
    ).
  • The variable draftversion is set to true in the YAML header
    of the sources of the specifications I have modified.
  • Please DO NOT add my GitHub profile to the list of contributors
    in the README page of the project.

main/acle.md Outdated Show resolved Hide resolved
main/acle.md Outdated Show resolved Hide resolved
main/acle.md Outdated Show resolved Hide resolved
@andrewcarlotti
Copy link

I'd prefer slightly different naming for the intrinsics and new types. Specifically:

  1. Can we call the new types floatm8_t, floatm8x16_t, svfloatm8_t, etc.? This would be more consistent with existing type names while still preserving the "modal" distinction. It also makes the type name more easily distinguishable from FPMR values (which use fpm_t).

  2. Can we drop the _fpm suffix from all the intrinsic names, and instead represent the modality in the type suffix (by replacing _f8 with _fm8 wherever it appears)?

Combining these, my proposal is to replace, for example,
float16x4_t vdot_lane_f16_f8_fpm(float16x4_t vd, fpm8x8_t vn, fpm8x8_t vm, __builtin_constant_p(lane), fpm_t fpm)
with
float16x4_t vdot_lane_f16_fm8(float16x4_t vd, floatm8x8_t vn, floatm8x8_t vm, __builtin_constant_p(lane), fpm_t fpm).

@momchil-velikov
Copy link
Contributor Author

  1. Can we call the new types floatm8_t, floatm8x16_t, svfloatm8_t, etc.?
  2. Can we drop the _fpm suffix from all the intrinsic names, and instead represent the modality in the type suffix (by replacing _f8 with _fm8 wherever it appears)?

I am in favour of both proposals.

@rsandifo-arm
Copy link
Contributor

This might have been mentioned already, but the new vector types should also be added to svset_neonq, svget_neonq and svdup_neonq.

CarolineConcatto added a commit to CarolineConcatto/llvm-project that referenced this pull request Jul 1, 2024
ARM ACLE PR#323[1] adds new modal types for 8-bit floating point intrinsic.

From the PR#323:
```
ACLE defines the `__fpm8` type, which can be used for the E5M2 and E4M3
8-bit floating-point formats. It is a storage and interchange only type
with no arithmetic operations other than intrinsic calls.
````

The type should be an opaque type and its format in undefined in Clang.
Only defined in the backend by a status/format register, for AArch64 the FPMR.

This patch is an attempt to the add the fpm8_t scalar type.
It has a parser and codegen for the new scalar type.

The patch it is lowering to and 8bit unsigned as it has no format.
But maybe we should add another opaque type.

[1]  ARM-software/acle#323
@momchil-velikov
Copy link
Contributor Author

  1. Can we call the new types floatm8_t, floatm8x16_t, svfloatm8_t, etc.?
  2. Can we drop the _fpm suffix from all the intrinsic names, and instead represent the modality in the type suffix (by replacing _f8 with _fm8 wherever it appears)?

I am in favour of both proposals.

Coming up Soon(tm).

@momchil-velikov
Copy link
Contributor Author

This might have been mentioned already, but the new vector types should also be added to svset_neonq, svget_neonq and svdup_neonq.

My next step is to add intrinsics for the untyped SVE/SME instructions, that would include these too.

CarolineConcatto added a commit to CarolineConcatto/llvm-project that referenced this pull request Jul 22, 2024
This patch adds these new vector sizes for neon:
fpm8x16_t and fpm8x8_t

According to the ARM ACLE PR#323[1].

[1] ARM-software/acle#323
@momchil-velikov
Copy link
Contributor Author

I'd prefer slightly different naming for the intrinsics and new types. Specifically:

  1. Can we call the new types floatm8_t, floatm8x16_t, svfloatm8_t, etc.? This would be more consistent with existing type names while still preserving the "modal" distinction. It also makes the type name more easily distinguishable from FPMR values (which use fpm_t).

This part done.

main/acle.md Outdated Show resolved Hide resolved
@paulwalker-arm
Copy link

  1. Can we call the new types floatm8_t, floatm8x16_t, svfloatm8_t, etc.? This would be more consistent with existing type names while still preserving the "modal" distinction. It also makes the type name more easily distinguishable from FPMR values (which use fpm_t).

As an amendment that follows the scheme used when going from float16 -> bfloat16 what about mfloat8_t, mfloat8x16_t, svmfloat8_t with "m" meaning "modal"?

@momchil-velikov
Copy link
Contributor Author

momchil-velikov commented Jul 30, 2024

Things renamed according to the above naming scheme.

Copy link
Contributor

@rsandifo-arm rsandifo-arm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new feature macros should also be listed in the “Summary of predefined macros” section.

main/acle.md Outdated Show resolved Hide resolved
main/acle.md Outdated Show resolved Hide resolved
main/acle.md Outdated Show resolved Hide resolved
main/acle.md Outdated Show resolved Hide resolved
main/acle.md Outdated Show resolved Hide resolved
tools/intrinsic_db/advsimd.csv Outdated Show resolved Hide resolved
tools/intrinsic_db/advsimd.csv Outdated Show resolved Hide resolved
tools/intrinsic_db/advsimd.csv Outdated Show resolved Hide resolved
tools/intrinsic_db/advsimd.csv Outdated Show resolved Hide resolved
main/acle.md Outdated Show resolved Hide resolved
CarolineConcatto added a commit to CarolineConcatto/llvm-project that referenced this pull request Jul 31, 2024
 `mfloat8_t`     | equivalent to `__mfp8` |

According to ACLE[1] proposal
[1] ARM-software/acle#323
CarolineConcatto added a commit to CarolineConcatto/llvm-project that referenced this pull request Jul 31, 2024
 `mfloat8_t`     | equivalent to `__mfp8` |

According to ACLE[1] proposal
[1] ARM-software/acle#323
CarolineConcatto added a commit to CarolineConcatto/llvm-project that referenced this pull request Jul 31, 2024
CarolineConcatto added a commit to CarolineConcatto/llvm-project that referenced this pull request Jul 31, 2024
and fpm8x16_t to mfloat8x16_t

According to the ACLE[1]

[1]ARM-software/acle#323
CarolineConcatto added a commit to CarolineConcatto/llvm-project that referenced this pull request Aug 2, 2024
ARM ACLE PR#323[1] adds new modal types for 8-bit floating point intrinsic.

From the PR#323:
```
ACLE defines the `__mfp8` type, which can be used for the E5M2 and E4M3
8-bit floating-point formats. It is a storage and interchange only type
with no arithmetic operations other than intrinsic calls.
 `mfloat8_t`     | equivalent to `__mfp8` |
````

The type should be an opaque type and its format in undefined in Clang.
Only defined in the backend by a status/format register, for AArch64 the FPMR.

This patch is an attempt to the add the fpm8_t scalar type.
It has a parser and codegen for the new scalar type.

The patch it is lowering to and 8bit unsigned as it has no format.
But maybe we should add another opaque type.

According to ACLE[1] proposal
[1] ARM-software/acle#323
CarolineConcatto added a commit to CarolineConcatto/llvm-project that referenced this pull request Aug 2, 2024
This patch adds these new vector sizes for neon:
mfloat8x16_t and mfloat8x8_t

According to the ARM ACLE PR#323[1].

[1] ARM-software/acle#323
CarolineConcatto added a commit to CarolineConcatto/llvm-project that referenced this pull request Aug 2, 2024
This patch adds these new vector sizes for sve:
svmfloat8_t

According to the ARM ACLE PR#323[1].

[1] ARM-software/acle#323
Lukacma added a commit to llvm/llvm-project that referenced this pull request Sep 25, 2024
This patch implements these intrinsics:

FSCALE SINGLE AND MULTI
``` 
  // Variants are also available for:
  // [_single_f32_x2], [_single_f64_x2],
  // [_single_f16_x4], [_single_f32_x4], [_single_f64_x4]
  svfloat16x2_t svscale[_single_f16_x2](svfloat16x2_t zd, svfloat16_t zm) __arm_streaming;

  // Variants are also available for:
  //  [_f32_x2], [_f64_x2],
  //  [_f16_x4], [_f32_x4], [_f64_x4]
  svfloat16x2_t svscale[_f16_x2](svfloat16x2_t zd, svfloat16x2_t zm) __arm_streaming

```
(cf. ARM-software/acle#323)

Co-authored-by: Caroline Concatto <caroline.concatto@arm.com>
Lukacma added a commit to llvm/llvm-project that referenced this pull request Sep 26, 2024
This patch implements following intrinsics:

```
float16x4_t vscale_f16(float16x4_t vn, int16x4_t vm)	
float16x8_t vscaleq_f16(float16x8_t vn, int16x8_t vm)
float32x2_t vscale_f32(float32x2_t vn, int32x2_t vm)
float32x4_t vscaleq_f32(float32x4_t vn, int32x4_t vm)
float64x2_t vscaleq_f64(float64x2_t vn, int64x2_t vm)
```

as defined in ARM-software/acle#323

Co-authored-by: Hassnaa Hamdi <hassnaa.hamdi@arm.com>
augusto2112 pushed a commit to augusto2112/llvm-project that referenced this pull request Sep 26, 2024
This patch implements these intrinsics:

FSCALE SINGLE AND MULTI
``` 
  // Variants are also available for:
  // [_single_f32_x2], [_single_f64_x2],
  // [_single_f16_x4], [_single_f32_x4], [_single_f64_x4]
  svfloat16x2_t svscale[_single_f16_x2](svfloat16x2_t zd, svfloat16_t zm) __arm_streaming;

  // Variants are also available for:
  //  [_f32_x2], [_f64_x2],
  //  [_f16_x4], [_f32_x4], [_f64_x4]
  svfloat16x2_t svscale[_f16_x2](svfloat16x2_t zd, svfloat16x2_t zm) __arm_streaming

```
(cf. ARM-software/acle#323)

Co-authored-by: Caroline Concatto <caroline.concatto@arm.com>
augusto2112 pushed a commit to augusto2112/llvm-project that referenced this pull request Sep 26, 2024
This patch implements following intrinsics:

```
float16x4_t vscale_f16(float16x4_t vn, int16x4_t vm)	
float16x8_t vscaleq_f16(float16x8_t vn, int16x8_t vm)
float32x2_t vscale_f32(float32x2_t vn, int32x2_t vm)
float32x4_t vscaleq_f32(float32x4_t vn, int32x4_t vm)
float64x2_t vscaleq_f64(float64x2_t vn, int64x2_t vm)
```

as defined in ARM-software/acle#323

Co-authored-by: Hassnaa Hamdi <hassnaa.hamdi@arm.com>
Sterling-Augustine pushed a commit to Sterling-Augustine/llvm-project that referenced this pull request Sep 27, 2024
This patch implements following intrinsics:

```
float16x4_t vscale_f16(float16x4_t vn, int16x4_t vm)	
float16x8_t vscaleq_f16(float16x8_t vn, int16x8_t vm)
float32x2_t vscale_f32(float32x2_t vn, int32x2_t vm)
float32x4_t vscaleq_f32(float32x4_t vn, int32x4_t vm)
float64x2_t vscaleq_f64(float64x2_t vn, int64x2_t vm)
```

as defined in ARM-software/acle#323

Co-authored-by: Hassnaa Hamdi <hassnaa.hamdi@arm.com>
puja2196 pushed a commit to puja2196/LLVM-tutorial that referenced this pull request Sep 30, 2024
This patch implements following intrinsics:

```
float16x4_t vscale_f16(float16x4_t vn, int16x4_t vm)	
float16x8_t vscaleq_f16(float16x8_t vn, int16x8_t vm)
float32x2_t vscale_f32(float32x2_t vn, int32x2_t vm)
float32x4_t vscaleq_f32(float32x4_t vn, int32x4_t vm)
float64x2_t vscaleq_f64(float64x2_t vn, int64x2_t vm)
```

as defined in ARM-software/acle#323

Co-authored-by: Hassnaa Hamdi <hassnaa.hamdi@arm.com>
puja2196 pushed a commit to puja2196/LLVM-tutorial that referenced this pull request Oct 2, 2024
This patch implements following intrinsics:

```
float16x4_t vscale_f16(float16x4_t vn, int16x4_t vm)	
float16x8_t vscaleq_f16(float16x8_t vn, int16x8_t vm)
float32x2_t vscale_f32(float32x2_t vn, int32x2_t vm)
float32x4_t vscaleq_f32(float32x4_t vn, int32x4_t vm)
float64x2_t vscaleq_f64(float64x2_t vn, int64x2_t vm)
```

as defined in ARM-software/acle#323

Co-authored-by: Hassnaa Hamdi <hassnaa.hamdi@arm.com>
xgupta pushed a commit to xgupta/llvm-project that referenced this pull request Oct 4, 2024
This patch implements these intrinsics:

FSCALE SINGLE AND MULTI
``` 
  // Variants are also available for:
  // [_single_f32_x2], [_single_f64_x2],
  // [_single_f16_x4], [_single_f32_x4], [_single_f64_x4]
  svfloat16x2_t svscale[_single_f16_x2](svfloat16x2_t zd, svfloat16_t zm) __arm_streaming;

  // Variants are also available for:
  //  [_f32_x2], [_f64_x2],
  //  [_f16_x4], [_f32_x4], [_f64_x4]
  svfloat16x2_t svscale[_f16_x2](svfloat16x2_t zd, svfloat16x2_t zm) __arm_streaming

```
(cf. ARM-software/acle#323)

Co-authored-by: Caroline Concatto <caroline.concatto@arm.com>
xgupta pushed a commit to xgupta/llvm-project that referenced this pull request Oct 4, 2024
This patch implements following intrinsics:

```
float16x4_t vscale_f16(float16x4_t vn, int16x4_t vm)	
float16x8_t vscaleq_f16(float16x8_t vn, int16x8_t vm)
float32x2_t vscale_f32(float32x2_t vn, int32x2_t vm)
float32x4_t vscaleq_f32(float32x4_t vn, int32x4_t vm)
float64x2_t vscaleq_f64(float64x2_t vn, int64x2_t vm)
```

as defined in ARM-software/acle#323

Co-authored-by: Hassnaa Hamdi <hassnaa.hamdi@arm.com>
CarolineConcatto added a commit to CarolineConcatto/llvm-project that referenced this pull request Oct 9, 2024
This patch adds these new vector sizes for sve:
svmfloat8_t

According to the ARM ACLE PR#323[1].

[1] ARM-software/acle#323
CarolineConcatto added a commit to CarolineConcatto/llvm-project that referenced this pull request Oct 9, 2024
This patch adds these new vector sizes for sve:
svmfloat8_t

According to the ARM ACLE PR#323[1].

[1] ARM-software/acle#323
CarolineConcatto added a commit to CarolineConcatto/llvm-project that referenced this pull request Oct 14, 2024
ARM ACLE PR#323[1] adds new modal types for 8-bit floating point intrinsic.

From the PR#323:
```
ACLE defines the `__mfp8` type, which can be used for the E5M2 and E4M3
8-bit floating-point formats. It is a storage and interchange only type
with no arithmetic operations other than intrinsic calls.
````

The type should be an opaque type and its format in undefined in Clang.
Only defined in the backend by a status/format register, for AArch64 the FPMR.

This patch is an attempt to the add the MFloat8_t scalar type.
It has a parser and codegen for the new scalar type.

The patch it is lowering to and 8bit unsigned as it has no format.
But maybe we should add another opaque type.

[1]  ARM-software/acle#323
CarolineConcatto added a commit to CarolineConcatto/llvm-project that referenced this pull request Oct 14, 2024
This patch adds these new vector sizes for neon:
mfloat8x16_t and mfloat8x8_t

According to the ARM ACLE PR#323[1].

 `mfloat8_t`     | equivalent to `__mfp8` |

[1]ARM-software/acle#323
CarolineConcatto added a commit to llvm/llvm-project that referenced this pull request Oct 17, 2024
This patch adds these new vector sizes for sve:
    svmfloat8_t

According to the ARM ACLE PR#323[1].

[1] ARM-software/acle#323
bricknerb pushed a commit to bricknerb/llvm-project that referenced this pull request Oct 17, 2024
This patch adds these new vector sizes for sve:
    svmfloat8_t

According to the ARM ACLE PR#323[1].

[1] ARM-software/acle#323
CarolineConcatto added a commit to CarolineConcatto/llvm-project that referenced this pull request Oct 22, 2024
ARM ACLE PR#323[1] adds new modal types for 8-bit floating point intrinsic.

From the PR#323:
```
ACLE defines the `__mfp8` type, which can be used for the E5M2 and E4M3
8-bit floating-point formats. It is a storage and interchange only type
with no arithmetic operations other than intrinsic calls.
````

The type should be an opaque type and its format in undefined in Clang.
Only defined in the backend by a status/format register, for AArch64 the FPMR.

This patch is an attempt to the add the MFloat8_t scalar type.
It has a parser and codegen for the new scalar type.

The patch it is lowering to and 8bit unsigned as it has no format.
But maybe we should add another opaque type.

[1]  ARM-software/acle#323
CarolineConcatto added a commit to CarolineConcatto/llvm-project that referenced this pull request Oct 22, 2024
This patch adds these new vector sizes for neon:
mfloat8x16_t and mfloat8x8_t

According to the ARM ACLE PR#323[1].

 `mfloat8_t`     | equivalent to `__mfp8` |

[1]ARM-software/acle#323
EricWF pushed a commit to efcs/llvm-project that referenced this pull request Oct 22, 2024
This patch adds these new vector sizes for sve:
    svmfloat8_t

According to the ARM ACLE PR#323[1].

[1] ARM-software/acle#323
CarolineConcatto added a commit to llvm/llvm-project that referenced this pull request Oct 23, 2024
This patch adds these new vector sizes for neon:
   mfloat8x16_t and mfloat8x8_t

    According to the ARM ACLE PR#323[1].

    [1] ARM-software/acle#323
CarolineConcatto added a commit to CarolineConcatto/llvm-project that referenced this pull request Oct 23, 2024
ARM ACLE PR#323[1] adds new modal types for 8-bit floating point intrinsic.

From the PR#323:
```
ACLE defines the `__mfp8` type, which can be used for the E5M2 and E4M3
8-bit floating-point formats. It is a storage and interchange only type
with no arithmetic operations other than intrinsic calls.
````

The type should be an opaque type and its format in undefined in Clang.
Only defined in the backend by a status/format register, for AArch64 the FPMR.

This patch is an attempt to the add the MFloat8_t scalar type.
It has a parser and codegen for the new scalar type.

The patch it is lowering to and 8bit unsigned as it has no format.
But maybe we should add another opaque type.

[1]  ARM-software/acle#323
CarolineConcatto added a commit to CarolineConcatto/llvm-project that referenced this pull request Oct 24, 2024
ARM ACLE PR#323[1] adds new modal types for 8-bit floating point intrinsic.

From the PR#323:
```
ACLE defines the `__mfp8` type, which can be used for the E5M2 and E4M3
8-bit floating-point formats. It is a storage and interchange only type
with no arithmetic operations other than intrinsic calls.
````

The type should be an opaque type and its format in undefined in Clang.
Only defined in the backend by a status/format register, for AArch64 the FPMR.

This patch is an attempt to the add the MFloat8_t scalar type.
It has a parser and codegen for the new scalar type.

The patch it is lowering to and 8bit unsigned as it has no format.
But maybe we should add another opaque type.

[1]  ARM-software/acle#323
CarolineConcatto added a commit to llvm/llvm-project that referenced this pull request Oct 25, 2024
ARM ACLE PR#323[1] adds new modal types for 8-bit floating point
intrinsic.

From the PR#323:
```
ACLE defines the `__mfp8` type, which can be used for the E5M2 and E4M3
8-bit floating-point formats. It is a storage and interchange only type
with no arithmetic operations other than intrinsic calls.
````

The type should be an opaque type and its format in undefined in Clang.
Only defined in the backend by a status/format register, for AArch64 the
FPMR.

This patch is an attempt to the add the mfloat8_t scalar type. It has a
parser and codegen for the new scalar type.

The patch it is lowering to and 8bit unsigned as it has no format. But
maybe we should add another opaque type.

[1]  ARM-software/acle#323
NoumanAmir657 pushed a commit to NoumanAmir657/llvm-project that referenced this pull request Nov 4, 2024
This patch adds these new vector sizes for neon:
   mfloat8x16_t and mfloat8x8_t

    According to the ARM ACLE PR#323[1].

    [1] ARM-software/acle#323
NoumanAmir657 pushed a commit to NoumanAmir657/llvm-project that referenced this pull request Nov 4, 2024
…#97277)

ARM ACLE PR#323[1] adds new modal types for 8-bit floating point
intrinsic.

From the PR#323:
```
ACLE defines the `__mfp8` type, which can be used for the E5M2 and E4M3
8-bit floating-point formats. It is a storage and interchange only type
with no arithmetic operations other than intrinsic calls.
````

The type should be an opaque type and its format in undefined in Clang.
Only defined in the backend by a status/format register, for AArch64 the
FPMR.

This patch is an attempt to the add the mfloat8_t scalar type. It has a
parser and codegen for the new scalar type.

The patch it is lowering to and 8bit unsigned as it has no format. But
maybe we should add another opaque type.

[1]  ARM-software/acle#323
SpencerAbson added a commit to llvm/llvm-project that referenced this pull request Nov 28, 2024
…116959)

This patch implements the following intrinsics:

8-bit floating-point convert to deinterleaved half-precision or
BFloat16.
``` c
  // Variant is also available for: _bf16[_mf8]_x2
  svfloat16x2_t svcvtl1_f16[_mf8]_x2_fpm(svmfloat8_t zn, fpm_t fpm) __arm_streaming;
  svfloat16x2_t svcvtl2_f16[_mf8]_x2_fpm(svmfloat8_t zn, fpm_t fpm) __arm_streaming;
```

Defined in ARM-software/acle#323

Co-authored-by: Caroline Concatto caroline.concatto@arm.com
Co-authored-by: Marian Lukac marian.lukac@arm.com
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.