This repository has been archived by the owner on Jan 29, 2025. It is now read-only.
Fixes for `fma` function #1580
Merged
7 commits
beddc71 [hlsl-out] Write `mad` intrinsic for `fma` function (parasyte)
5cec686 Add FMA feature to glsl backend (parasyte)
df97573 Transform GLSL fma function into an airthmetic expression when necessary (parasyte)
f4e2416 Add tests for GLSL fma function tranformation (parasyte)
7454d1f Remove the hazard comment from the webgl test input (parasyte)
4f9addb Add helper method for fma function support checks (parasyte)
7c8bedc Address review comment (parasyte)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
( | ||
glsl: ( | ||
version: Embedded(300), | ||
writer_flags: (bits: 0), | ||
binding_map: {}, | ||
), | ||
) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
fn test_fma() -> vec2<f32> { | ||
let a = vec2<f32>(2.0, 2.0); | ||
let b = vec2<f32>(0.5, 0.5); | ||
let c = vec2<f32>(0.5, 0.5); | ||
|
||
return fma(a, b, c); | ||
} | ||
|
||
|
||
[[stage(vertex)]] | ||
fn main() { | ||
let a = test_fma(); | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
( | ||
) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
fn test_fma() -> vec2<f32> { | ||
we should probably move a few things from |
||
let a = vec2<f32>(2.0, 2.0); | ||
let b = vec2<f32>(0.5, 0.5); | ||
let c = vec2<f32>(0.5, 0.5); | ||
|
||
// Hazard: HLSL needs a different intrinsic function for f32 and f64 | ||
// See: https://github.com/gfx-rs/naga/issues/1579 | ||
return fma(a, b, c); | ||
} | ||
|
||
|
||
[[stage(compute), workgroup_size(1)]] | ||
fn main() { | ||
let a = test_fma(); | ||
} |
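The hazard comment in this test input says that HLSL needs a different intrinsic for f32 and f64. A minimal Rust sketch of that selection is given here, assuming `mad` is emitted for 32-bit floats (as in this PR's HLSL output) and `fma` for doubles. `FloatWidth`, `hlsl_fma_intrinsic`, and `write_hlsl_fma` are hypothetical names for illustration and are not taken from naga's HLSL backend.

```rust
/// Hypothetical operand widths for a WGSL `fma` call.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum FloatWidth {
    F32,
    F64,
}

/// Picks the HLSL intrinsic name for a fused multiply-add.
/// Assumption: `mad` is used for 32-bit floats and `fma` for doubles.
pub fn hlsl_fma_intrinsic(width: FloatWidth) -> &'static str {
    match width {
        FloatWidth::F32 => "mad",
        FloatWidth::F64 => "fma",
    }
}

/// Formats the call expression as it might appear in generated HLSL.
pub fn write_hlsl_fma(width: FloatWidth, a: &str, b: &str, c: &str) -> String {
    format!("{}({}, {}, {})", hlsl_fma_intrinsic(width), a, b, c)
}
```

With `FloatWidth::F32` and operands named `a`, `b`, `c`, the helper produces `mad(a, b, c)`, the same call shape as the HLSL output in this PR.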
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
#version 300 es | ||
|
||
precision highp float; | ||
precision highp int; | ||
|
||
|
||
vec2 test_fma() { | ||
vec2 a = vec2(2.0, 2.0); | ||
vec2 b = vec2(0.5, 0.5); | ||
vec2 c = vec2(0.5, 0.5); | ||
return (a * b + c); | ||
} | ||
|
||
void main() { | ||
vec2 _e0 = test_fma(); | ||
return; | ||
} | ||
|
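The GLSL ES 3.00 output above expands `fma(a, b, c)` into `(a * b + c)`. For the inputs in this test both forms give exactly 1.5, but an expanded expression rounds the product before the addition, whereas a fused multiply-add rounds only the final result, so the two can differ in general. The following Rust sketch illustrates this on the CPU with the standard `f32::mul_add`; the function names are made up for this example and it says nothing about how a particular GPU driver evaluates either form.

```rust
/// Fused multiply-add: the exact product plus `c` is rounded once.
pub fn fused(a: f32, b: f32, c: f32) -> f32 {
    a.mul_add(b, c)
}

/// Expanded form: the product is rounded, then the sum is rounded again.
pub fn unfused(a: f32, b: f32, c: f32) -> f32 {
    a * b + c
}

/// Returns `(fused, unfused)` for inputs chosen so that the two differ.
/// With a = 1 + eps, a * a = 1 + 2*eps + eps^2, and the eps^2 term is lost
/// when the product is rounded to f32 before the addition.
pub fn rounding_gap() -> (f32, f32) {
    let a = 1.0_f32 + f32::EPSILON;
    let c = -(1.0_f32 + 2.0 * f32::EPSILON);
    (fused(a, a, c), unfused(a, a, c))
}
```

Calling `fused(2.0, 0.5, 0.5)` and `unfused(2.0, 0.5, 0.5)` both return 1.5, which is why the test output is unaffected by the transformation.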
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
#version 310 es | ||
#extension GL_EXT_gpu_shader5 : require | ||
|
||
precision highp float; | ||
precision highp int; | ||
|
||
layout(local_size_x = 1, local_size_y = 1, local_size_z = 1) in; | ||
|
||
|
||
vec2 test_fma() { | ||
vec2 a = vec2(2.0, 2.0); | ||
vec2 b = vec2(0.5, 0.5); | ||
vec2 c = vec2(0.5, 0.5); | ||
return fma(a, b, c); | ||
} | ||
|
||
void main() { | ||
vec2 _e0 = test_fma(); | ||
return; | ||
} | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
|
||
float2 test_fma() | ||
{ | ||
float2 a = float2(2.0, 2.0); | ||
float2 b = float2(0.5, 0.5); | ||
float2 c = float2(0.5, 0.5); | ||
return mad(a, b, c); | ||
} | ||
|
||
[numthreads(1, 1, 1)] | ||
void main() | ||
{ | ||
const float2 _e0 = test_fma(); | ||
return; | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
vertex=() | ||
fragment=() | ||
compute=(main:cs_5_1 ) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
// language: metal1.1 | ||
#include <metal_stdlib> | ||
#include <simd/simd.h> | ||
|
||
|
||
metal::float2 test_fma( | ||
) { | ||
metal::float2 a = metal::float2(2.0, 2.0); | ||
metal::float2 b = metal::float2(0.5, 0.5); | ||
metal::float2 c = metal::float2(0.5, 0.5); | ||
return metal::fma(a, b, c); | ||
} | ||
|
||
kernel void main_( | ||
) { | ||
metal::float2 _e0 = test_fma(); | ||
return; | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,33 @@ | ||
; SPIR-V | ||
; Version: 1.1 | ||
; Generator: rspirv | ||
; Bound: 20 | ||
OpCapability Shader | ||
%1 = OpExtInstImport "GLSL.std.450" | ||
OpMemoryModel Logical GLSL450 | ||
OpEntryPoint GLCompute %16 "main" | ||
OpExecutionMode %16 LocalSize 1 1 1 | ||
%2 = OpTypeVoid | ||
%4 = OpTypeFloat 32 | ||
%3 = OpConstant %4 2.0 | ||
%5 = OpConstant %4 0.5 | ||
%6 = OpTypeVector %4 2 | ||
%9 = OpTypeFunction %6 | ||
%17 = OpTypeFunction %2 | ||
%8 = OpFunction %6 None %9 | ||
%7 = OpLabel | ||
OpBranch %10 | ||
%10 = OpLabel | ||
%11 = OpCompositeConstruct %6 %3 %3 | ||
%12 = OpCompositeConstruct %6 %5 %5 | ||
%13 = OpCompositeConstruct %6 %5 %5 | ||
%14 = OpExtInst %6 %1 Fma %11 %12 %13 | ||
OpReturnValue %14 | ||
OpFunctionEnd | ||
%16 = OpFunction %2 None %17 | ||
%15 = OpLabel | ||
OpBranch %18 | ||
%18 = OpLabel | ||
%19 = OpFunctionCall %6 %8 | ||
OpReturn | ||
OpFunctionEnd |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
fn test_fma() -> vec2<f32> { | ||
let a = vec2<f32>(2.0, 2.0); | ||
let b = vec2<f32>(0.5, 0.5); | ||
let c = vec2<f32>(0.5, 0.5); | ||
return fma(a, b, c); | ||
} | ||
|
||
[[stage(compute), workgroup_size(1, 1, 1)]] | ||
fn main() { | ||
let _e0 = test_fma(); | ||
return; | ||
} |
Why do we need this flag? If FMA isn't natively supported, we are emulating it anyway. So it seems to me that this flag isn't getting us anything.
For one thing, I'm following precedent with existing features that need extensions, and `fma` is only supported on GLES 3.1+ with: naga/tests/out/glsl/functions.main.Compute.glsl, lines 1 to 2 in 7454d1f
This PR does not emulate `fma` on GLSL in every case; it decides whether it must emulate it, or else requests the extension when necessary. This fixes that unusual validation error you noted earlier: https://github.com/gfx-rs/naga/runs/4446174291?check_suite_focus=true
I chose to use the existing feature flag infrastructure to write this extension, rather than coming up with something unique just for this case. Is there something better I could have done here?
The `FMA` feature flag name is probably too narrow, honestly. `GL_EXT_gpu_shader5` enables a lot more than just the `fma` function, and the feature flag can be used to support all of it on GLES.
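The decision described in the comment above (emulate, or use `fma` and request the extension when needed) can be sketched roughly as follows. This is an illustration only, not the PR's code: `Version`, `FmaStrategy`, `fma_strategy`, and `write_fma` are hypothetical names, and the version thresholds are assumptions (`fma` built in from desktop GLSL 4.00 and GLSL ES 3.20, available through `GL_EXT_gpu_shader5` on GLSL ES 3.10, expanded otherwise).

```rust
/// Hypothetical GLSL target version, e.g. `Embedded(300)` for GLSL ES 3.00.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Version {
    Desktop(u16),
    Embedded(u16),
}

/// How a WGSL `fma` call is written for a given target.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum FmaStrategy {
    /// `fma` is built in; no extension directive needed.
    Native,
    /// `fma` is usable once the named extension is required.
    NativeWithExtension(&'static str),
    /// No `fma` available; expand to `(a * b + c)`.
    Expand,
}

/// Chooses a strategy. The thresholds here are assumptions for this sketch.
pub fn fma_strategy(version: Version) -> FmaStrategy {
    match version {
        Version::Desktop(v) if v >= 400 => FmaStrategy::Native,
        Version::Embedded(v) if v >= 320 => FmaStrategy::Native,
        Version::Embedded(v) if v >= 310 => {
            FmaStrategy::NativeWithExtension("GL_EXT_gpu_shader5")
        }
        _ => FmaStrategy::Expand,
    }
}

/// Writes the expression for `fma(a, b, c)` under the chosen strategy.
pub fn write_fma(version: Version, a: &str, b: &str, c: &str) -> String {
    match fma_strategy(version) {
        FmaStrategy::Expand => format!("({} * {} + {})", a, b, c),
        _ => format!("fma({}, {}, {})", a, b, c),
    }
}
```

Under these assumptions, `Version::Embedded(300)` gives `(a * b + c)` and `Version::Embedded(310)` gives `fma(a, b, c)` together with the extension requirement, which matches the two GLSL outputs in this PR.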
The capability flags in the GLSL backend are meant to be requirements, i.e. the shader requires A, B, C, and we want to check if we can work with this shader at all.
The case for FMA is different. The backend always supports the FMA instruction. The only thing that differs is the code path taken. Therefore, there is no case where the GLSL backend would check for this capability and report it missing. It's not a real capability.
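The distinction drawn in this comment, between a capability that a shader requires and a mere choice of code path, can be illustrated with a short Rust sketch. All names below (`Requirement`, `check_requirements`, `fma_expression`) are hypothetical and the version numbers are placeholder assumptions; the point is only that a requirement check can fail, while writing `fma` never does.

```rust
/// A hypothetical capability that a shader may require from a GLSL ES target.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Requirement {
    pub name: &'static str,
    /// Lowest GLSL ES version (e.g. 310) that satisfies it; a placeholder value.
    pub min_es_version: u16,
}

/// Requirement check: fails when the target cannot run the shader at all.
pub fn check_requirements(es_version: u16, required: &[Requirement]) -> Result<(), String> {
    for req in required {
        if es_version < req.min_es_version {
            return Err(format!(
                "{} requires GLSL ES {} but the target is {}",
                req.name, req.min_es_version, es_version
            ));
        }
    }
    Ok(())
}

/// Code-path choice: always succeeds, only the emitted expression differs.
pub fn fma_expression(es_version: u16) -> &'static str {
    if es_version >= 310 {
        "fma(a, b, c)"
    } else {
        "(a * b + c)"
    }
}
```

A requirement such as compute shader support on a GLSL ES 3.00 target would be reported as an error by `check_requirements`, whereas `fma_expression` returns something usable for every version.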
I think I understand what you mean. Let me try to rephrase it: the `fma` function from the frontend is always supported by the backend (GLSL), even if it has to fall back to an arithmetic transformation (i.e., "emulated"). This PR uses a feature flag (capability) in another sense: it can enable the use of the GLSL `fma` function on particular versions of the backend, which is not how feature flags are used elsewhere.
Would that be an accurate way to describe the situation?
I'll have to think about whether I need to use some other mechanism to enable the extension for GLES. I do see an extension enabled that is not controlled by feature flags: naga/src/back/glsl/mod.rs, lines 442 to 450 in 7c8bedc