
Support half precision floating point types #313

Merged: 22 commits into beehive-lab:develop on Jan 24, 2024

Conversation

mairooni (Collaborator)

Description

This PR provides support for half-float types, i.e. float values represented by 16 bits instead of 32. Since Java does not currently offer a half-float type, in this implementation it is represented by a new class, HalfFloat.
The HalfFloat class offers two constructors. The first receives a float value, which is internally converted to float-16, using the Float.floatToFloat16 function. The second directly receives a short value, which represents a half-float.
The HalfFloat class exposes a set of operations that can be performed between half-float values, specifically addition, subtraction, multiplication, and division.
Finally, a new off-heap array type for half-float values (HalfFloatArray class) has been included.
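Since the description names `Float.floatToFloat16` as the conversion primitive, the round trip it relies on can be sketched in plain Java (standard since JDK 20). This is a minimal illustration of the precision trade-off, not the TornadoVM `HalfFloat` API itself; the `addHalf` helper is a hypothetical sketch of how the arithmetic operations presumably widen to FP32, operate, and narrow back:

```java
public class HalfFloatSketch {

    // Hypothetical sketch of an FP16 addition: widen to FP32, add, narrow back.
    static short addHalf(short a, short b) {
        return Float.floatToFloat16(Float.float16ToFloat(a) + Float.float16ToFloat(b));
    }

    public static void main(String[] args) {
        float original = 1.5f;                        // exactly representable in FP16
        short half = Float.floatToFloat16(original);  // 16-bit encoding (IEEE 754 binary16)
        float back = Float.float16ToFloat(half);      // widen back to FP32
        System.out.println(back);                     // prints 1.5

        // FP16 has a 10-bit mantissa (~3 decimal digits), so most values round:
        float pi = 3.14159265f;
        float piHalf = Float.float16ToFloat(Float.floatToFloat16(pi));
        System.out.println(piHalf);
    }
}
```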

Backend/s tested

Mark the backends affected by this PR.

  • OpenCL
  • PTX
  • SPIRV

OS tested

Mark the OS where this PR is tested.

  • Linux
  • macOS
  • Windows

Did you check on FPGAs?

If it is applicable, check your changes on FPGAs.

  • Yes
  • No

How to test the new patch?

Tests have been added for the following scenarios:

  1. Initialization of a new HalfFloat instance in the kernel:
    tornado-test -V uk.ac.manchester.tornado.unittests.arrays.TestArrays#testHalfFloatInitialization
  2. Addition of two HalfFloatArrays:
    tornado-test -V uk.ac.manchester.tornado.unittests.arrays.TestArrays#testVectorAdditionHalfFloat
  3. Subtraction of two HalfFloatArrays:
    tornado-test -V uk.ac.manchester.tornado.unittests.arrays.TestArrays#testVectorSubtractionHalfFloat
  4. Multiplication of two HalfFloatArrays:
    tornado-test -V uk.ac.manchester.tornado.unittests.arrays.TestArrays#testVectorMultiplicationHalfFloat
  5. Division between two HalfFloatArrays:
    tornado-test -V uk.ac.manchester.tornado.unittests.arrays.TestArrays#testVectorDivisionHalfFloat
  6. Test the fromElements and fromArray functions of the HalfFloatArray class:
    tornado-test -V uk.ac.manchester.tornado.unittests.api.TestAPI#testSegmentsHalfFloats
  7. Test the fromSegment function of the HalfFloatArray class:
    tornado-test -V uk.ac.manchester.tornado.unittests.api.TestAPI#testBuildWithSegmentsHalfFloat

@mairooni mairooni added enhancement New feature or request API labels Jan 23, 2024
@mairooni mairooni self-assigned this Jan 23, 2024
@@ -191,7 +191,7 @@ def build_spirv_toolkit_and_level_zero(rebuild=False):

if (rebuild or build):
os.chdir(spirv_tool_kit)
subprocess.run(["git", "pull", "origin", "master"])
Member

I think this needs to be reverted

Member

We keep this until we merge the code from the Beehive SPIR-V Toolkit

* The second {@code HalfFloat} input for the multiplication.
* @return A new {@code HalfFloat} containing the result of the multiplication.
*/
public static HalfFloat mult(HalfFloat a, HalfFloat b) {
Member

nit: any reason we called it mult here and mul elsewhere?

@@ -74,6 +74,8 @@ public OCLAssembler(TargetDescription target) {
emitLine("#pragma OPENCL EXTENSION cl_khr_fp64 : enable ");
}

emitLine("#pragma OPENCL EXTENSION cl_khr_fp16 : enable ");
Member

so, now we are going to emit this in all OpenCL kernels we generate, regardless of whether they use half-precision floating point?

Member

I think we should emit this line only if the platform supports it.

@@ -74,6 +74,8 @@ public OCLAssembler(TargetDescription target) {
emitLine("#pragma OPENCL EXTENSION cl_khr_fp64 : enable ");
}

emitLine("#pragma OPENCL EXTENSION cl_khr_fp16 : enable ");

if (((OCLTargetDescription) target).supportsInt64Atomics()) {
Member

can we have a similar condition instead?
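The guard suggested in this review thread, mirroring the existing `supportsInt64Atomics()` pattern, can be sketched as follows. This is a hypothetical, self-contained illustration: `PragmaEmitter`, `emitExtensionPragmas`, and the boolean parameters are illustrative names, not the actual TornadoVM `OCLAssembler` API:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: emit each OpenCL extension pragma only when the
// target device reports the corresponding capability.
public class PragmaEmitter {
    private final List<String> lines = new ArrayList<>();

    void emitLine(String line) {
        lines.add(line);
    }

    // In the real assembler these flags would come from the target
    // description (e.g. a supportsFp16() query on the device).
    void emitExtensionPragmas(boolean supportsFp64, boolean supportsFp16) {
        if (supportsFp64) {
            emitLine("#pragma OPENCL EXTENSION cl_khr_fp64 : enable");
        }
        if (supportsFp16) {
            emitLine("#pragma OPENCL EXTENSION cl_khr_fp16 : enable");
        }
    }

    List<String> lines() {
        return lines;
    }
}
```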

@jjfumero (Member) left a comment:

Let's add the check for FP16 in OpenCL (pragma)

@jjfumero (Member)

Unit tests are passing for SPIR-V and PTX. For OpenCL, this passes depending on the device.

In my case, using an RTX 2060 with NVIDIA driver 510.54, the tests do not pass.

@mairooni (Collaborator, Author)

mairooni commented Jan 24, 2024

I included a function that checks whether FP16 is supported. However, since my machine does not report this capability yet still passes the OpenCL tests, @jjfumero and I decided to keep emitting the OpenCL pragma for the time being, regardless of reported support, until we figure out why this happens.

@jjfumero (Member)

Yes, we can query the capability. The pragma can always be generated. We need to investigate how to run OpenCL FP16 on modern NVIDIA GPUs.

@mikepapadim (Member)

OK, let's merge it then; we can iterate on a codegen fix once they add the extension in the driver.

@jjfumero (Member)

A new conflict has appeared. @mairooni, can you take a look?

@jjfumero jjfumero merged commit 2200228 into beehive-lab:develop Jan 24, 2024
1 check passed
@mairooni mairooni deleted the feat/float16 branch January 24, 2024 10:10
jjfumero added a commit that referenced this pull request Jan 30, 2024
TornadoVM 1.0.1
----------------
30/01/2024

Improvements
~~~~~~~~~~~~~~~~~~

- `#305 <https://github.com/beehive-lab/TornadoVM/pull/305>`_: Under-demand data transfer for custom data ranges.
- `#305 <https://github.com/beehive-lab/TornadoVM/pull/305>`_: Copy out subregions using the execution plan.
- `#313 <https://github.com/beehive-lab/TornadoVM/pull/313>`_: Initial support for Half-Precision (FP16) data types.
- `#311 <https://github.com/beehive-lab/TornadoVM/pull/311>`_: Enable Multi-Task Multiple Device (MTMD) model from the ``TornadoExecutionPlan`` API.
- `#315 <https://github.com/beehive-lab/TornadoVM/pull/315>`_: Math ``Ceil`` function added.

Compatibility/Integration
~~~~~~~~~~~~~~~~~~~~~~~~~~~

- `#294 <https://github.com/beehive-lab/TornadoVM/pull/294>`_: Separation of the OpenCL Headers from the code base.
- `#297 <https://github.com/beehive-lab/TornadoVM/pull/297>`_: Separation of the LevelZero JNI API in a separate repository.
- `#301 <https://github.com/beehive-lab/TornadoVM/pull/301>`_: Temurin configuration supported.
- `#304 <https://github.com/beehive-lab/TornadoVM/pull/304>`_: Refactor of the common phases for the JIT compiler.
- `#316 <https://github.com/beehive-lab/TornadoVM/pull/316>`_: Beehive SPIR-V Toolkit version updated.

Bug Fixes
~~~~~~~~~~~~~~~~~~

- `#298 <https://github.com/beehive-lab/TornadoVM/pull/298>`_: OpenCL Codegen fixed open-close brackets.
- `#300 <https://github.com/beehive-lab/TornadoVM/pull/300>`_: Python dependencies fixed for AWS.
- `#308 <https://github.com/beehive-lab/TornadoVM/pull/308>`_: Runtime check for Grid-Scheduler names.
- `#309 <https://github.com/beehive-lab/TornadoVM/pull/309>`_: Fix check-style to support STR templates.
- `#314 <https://github.com/beehive-lab/TornadoVM/pull/314>`_: Emit Vector16 capability for 16-width vectors.
jjfumero added a commit that referenced this pull request Jan 30, 2024
Labels: API, enhancement (New feature or request)

3 participants