Support half precision floating point types #313
Conversation
@@ -191,7 +191,7 @@ def build_spirv_toolkit_and_level_zero(rebuild=False):

    if (rebuild or build):
        os.chdir(spirv_tool_kit)
        subprocess.run(["git", "pull", "origin", "master"])
I think this needs to be reverted
We keep this until we merge the code from the Beehive SPIR-V Toolkit
 * The second {@code HalfFloat} input for the multiplication.
 * @return A new {@code HalfFloat} containing the results of the multiplication.
 */
public static HalfFloat mult(HalfFloat a, HalfFloat b) {
nit: any reason we called it mult and not mul?
@@ -74,6 +74,8 @@ public OCLAssembler(TargetDescription target) {
            emitLine("#pragma OPENCL EXTENSION cl_khr_fp64 : enable ");
        }

        emitLine("#pragma OPENCL EXTENSION cl_khr_fp16 : enable ");
So, now we are going to emit this in all OpenCL kernels we generate, regardless of whether they use half-precision floating point?
I think we should emit this line only if the platform supports it.
@@ -74,6 +74,8 @@ public OCLAssembler(TargetDescription target) {
            emitLine("#pragma OPENCL EXTENSION cl_khr_fp64 : enable ");
        }

        emitLine("#pragma OPENCL EXTENSION cl_khr_fp16 : enable ");

        if (((OCLTargetDescription) target).supportsInt64Atomics()) {
Can we have a similar condition instead?
Let's add the check for FP16 in OpenCL (pragma)
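A minimal sketch of the suggested guard in OCLAssembler, assuming a hypothetical supportsFP16() method on OCLTargetDescription, modeled on the existing supportsInt64Atomics() check shown in the diff above:

// Hypothetical guard: emit the FP16 pragma only when the device reports
// the cl_khr_fp16 extension. supportsFP16() is an assumed method name,
// mirroring the existing supportsInt64Atomics().
if (((OCLTargetDescription) target).supportsFP16()) {
    emitLine("#pragma OPENCL EXTENSION cl_khr_fp16 : enable ");
}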
Unit tests are passing for SPIR-V and PTX. For OpenCL, this passes depending on the device; in my case, using an RTX 2060 with the NVIDIA 510.54 drivers, the tests do not pass.
I included a function that checks whether FP16 is supported. However, since my machine does not report this capability and yet I can pass the OpenCL tests, we decided with @jjfumero to keep printing the OpenCL pragma for the time being, regardless of whether it is supported, until we figure out why this is happening.
Yes, we can query the capability. The pragma can always be generated. We need to investigate how to run OpenCL FP16 on modern NVIDIA GPUs.
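As a sketch of how that capability query could be implemented (hypothetical method and accessor names; OpenCL exposes optional features such as cl_khr_fp16 through the device's extension string):

// Hypothetical helper on OCLTargetDescription: check the device's
// extension string for cl_khr_fp16. getExtensions() is an assumed
// accessor for the string reported via CL_DEVICE_EXTENSIONS.
public boolean supportsFP16() {
    return getExtensions().contains("cl_khr_fp16");
}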
OK, let's merge it then, and we can iterate on a fix in the codegen when they add the extension in the driver.
A new conflict has appeared. @mairooni, can you take a look?
TornadoVM 1.0.1
----------------
30/01/2024

Improvements
~~~~~~~~~~~~~~~~~~

- `#305 <https://github.com/beehive-lab/TornadoVM/pull/305>`_: Under-demand data transfer for custom data ranges.
- `#305 <https://github.com/beehive-lab/TornadoVM/pull/305>`_: Copy out subregions using the execution plan.
- `#313 <https://github.com/beehive-lab/TornadoVM/pull/313>`_: Initial support for Half-Precision (FP16) data types.
- `#311 <https://github.com/beehive-lab/TornadoVM/pull/311>`_: Enable Multi-Task Multiple Device (MTMD) model from the ``TornadoExecutionPlan`` API.
- `#315 <https://github.com/beehive-lab/TornadoVM/pull/315>`_: Math ``Ceil`` function added.

Compatibility/Integration
~~~~~~~~~~~~~~~~~~~~~~~~~~~

- `#294 <https://github.com/beehive-lab/TornadoVM/pull/294>`_: Separation of the OpenCL Headers from the code base.
- `#297 <https://github.com/beehive-lab/TornadoVM/pull/297>`_: Separation of the LevelZero JNI API in a separate repository.
- `#301 <https://github.com/beehive-lab/TornadoVM/pull/301>`_: Temurin configuration supported.
- `#304 <https://github.com/beehive-lab/TornadoVM/pull/304>`_: Refactor of the common phases for the JIT compiler.
- `#316 <https://github.com/beehive-lab/TornadoVM/pull/316>`_: Beehive SPIR-V Toolkit version updated.

Bug Fixes
~~~~~~~~~~~~~~~~~~

- `#298 <https://github.com/beehive-lab/TornadoVM/pull/298>`_: OpenCL Codegen fixed open-close brackets.
- `#300 <https://github.com/beehive-lab/TornadoVM/pull/300>`_: Python dependencies fixed for AWS.
- `#308 <https://github.com/beehive-lab/TornadoVM/pull/308>`_: Runtime check for Grid-Scheduler names.
- `#309 <https://github.com/beehive-lab/TornadoVM/pull/309>`_: Fix check-style to support STR templates.
- `#314 <https://github.com/beehive-lab/TornadoVM/pull/314>`_: Emit Vector16 capability for 16-width vectors.
Description

This PR provides support for half-float types, i.e., float values represented by 16 bits instead of 32. Since Java does not currently offer a half-float type, in this implementation it is represented by a new class, HalfFloat.

The HalfFloat class offers two constructors. The first receives a float value, which is internally converted to float-16 using the Float.floatToFloat16 function. The second directly receives a short value, which represents a half-float.

The HalfFloat class exposes a set of operations that can be performed between half-float values, specifically addition, subtraction, multiplication, and division.

Finally, a new off-heap array type for half-float values, the HalfFloatArray class, has been included. A short usage sketch follows.
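For illustration, a minimal sketch of the API described above; the mult name comes from this PR, while the import path and the add method name are assumptions based on the PR discussion:

import uk.ac.manchester.tornado.api.types.HalfFloat;

public class HalfFloatExample {
    public static void main(String[] args) {
        // Constructor 1: from a 32-bit float, converted internally
        // via Float.floatToFloat16 (available since Java 20).
        HalfFloat a = new HalfFloat(1.5f);

        // Constructor 2: from a raw 16-bit value; 0x3C00 is the
        // IEEE 754 half-precision encoding of 1.0.
        HalfFloat b = new HalfFloat((short) 0x3C00);

        // Static arithmetic operations between half-float values.
        HalfFloat sum = HalfFloat.add(a, b);      // assumed name
        HalfFloat product = HalfFloat.mult(a, b); // name used in this PR
    }
}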
Backend/s tested
Mark the backends affected by this PR.
OS tested
Mark the OS where this PR is tested.
Did you check on FPGAs?
If applicable, check your changes on FPGAs.
How to test the new patch?
Tests have been added for the following scenarios (a sketch of the vector-addition scenario follows the list):

- Initialization of a HalfFloat instance in the kernel:
  tornado-test -V uk.ac.manchester.tornado.unittests.arrays.TestArrays#testHalfFloatInitialization
- Vector operations (addition, subtraction, multiplication, division) with half floats:
  tornado-test -V uk.ac.manchester.tornado.unittests.arrays.TestArrays#testVectorAdditionHalfFloat
  tornado-test -V uk.ac.manchester.tornado.unittests.arrays.TestArrays#testVectorSubtractionHalfFloat
  tornado-test -V uk.ac.manchester.tornado.unittests.arrays.TestArrays#testVectorMultiplicationHalfFloat
  tornado-test -V uk.ac.manchester.tornado.unittests.arrays.TestArrays#testVectorDivisionHalfFloat
- The fromElements and fromArray functions of the HalfFloatArray class:
  tornado-test -V uk.ac.manchester.tornado.unittests.api.TestAPI#testSegmentsHalfFloats
- The fromSegment function of the HalfFloatArray class:
  tornado-test -V uk.ac.manchester.tornado.unittests.api.TestAPI#testBuildWithSegmentsHalfFloat
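For reference, here is a minimal sketch of what the vector-addition scenario might look like with TornadoVM's TaskGraph and TornadoExecutionPlan API. The import paths and the HalfFloatArray accessors (getSize, get, set) are assumptions modeled on TornadoVM's other off-heap array types:

import uk.ac.manchester.tornado.api.TaskGraph;
import uk.ac.manchester.tornado.api.TornadoExecutionPlan;
import uk.ac.manchester.tornado.api.annotations.Parallel;
import uk.ac.manchester.tornado.api.enums.DataTransferMode;
import uk.ac.manchester.tornado.api.types.HalfFloat;
import uk.ac.manchester.tornado.api.types.arrays.HalfFloatArray;

public class HalfFloatVectorAdd {

    // Kernel: element-wise half-float addition; the loop is
    // parallelized by TornadoVM via the @Parallel annotation.
    public static void vectorAdd(HalfFloatArray a, HalfFloatArray b, HalfFloatArray c) {
        for (@Parallel int i = 0; i < c.getSize(); i++) {
            c.set(i, HalfFloat.add(a.get(i), b.get(i)));
        }
    }

    public static void main(String[] args) {
        final int size = 1024;
        HalfFloatArray a = new HalfFloatArray(size);
        HalfFloatArray b = new HalfFloatArray(size);
        HalfFloatArray c = new HalfFloatArray(size);
        for (int i = 0; i < size; i++) {
            a.set(i, new HalfFloat(1.0f));
            b.set(i, new HalfFloat(2.0f));
        }

        // Build the task graph: copy inputs once, run the kernel,
        // and copy the result back on every execution.
        TaskGraph taskGraph = new TaskGraph("s0")
                .transferToDevice(DataTransferMode.FIRST_EXECUTION, a, b)
                .task("t0", HalfFloatVectorAdd::vectorAdd, a, b, c)
                .transferToHost(DataTransferMode.EVERY_EXECUTION, c);

        TornadoExecutionPlan plan = new TornadoExecutionPlan(taskGraph.snapshot());
        plan.execute();
    }
}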