Support for Float16 Images? #1140
Comments
I'm not sure how useful it would be to have 16-bit float return values from such samplers. There could be some use case for storage images (particularly writing values), but I don't understand why samplers, complete with filtering, would need this. The thread suggested register pressure as a reason to want 16-bit return values directly. But it seems to me that a platform where both register pressure from 32-bit floats is an issue and where the hardware can actually fetch 16-bit filtered values from a texture is probably smart enough to see the conversion to 16-bit values immediately after the fetch operation in SPIR-V and simply perform the more efficient operation. That is, if there is some performance benefit, I would think that |
Feels like it should just be there.
Would the case be different if we were speaking about
I am not entirely convinced that is in the spirit of explicitness. |
Yes. Having a format encoded into the sampler type directly like this is only useful when the default case can't provide equivalent results.
Are you suggesting we create a whole new set of sampler types just for
Neither is the fact that shader writers aren't required to qualify branches as to whether they will introduce divergence. Yet that's how SPIR-V works. This is treated as an implementation detail that compilers have to work out on their own, for their own hardware. I don't see why this would be different. |
There's something to be said about API consistency though.
Never! Just prodding for holes.
By all means.
Um, branches on variables are divergent. Why would anyone need to qualify a naturally divergent thing as divergent? Anyway, I accept your point even without dangerous analogies. |
FWIW we've (AMD) adopted roughly the approach @NicolBolas outlined above, which is why we've not pushed for it to go into core Vulkan. However we do allow RelaxedPrecision decorations on samplers which some vendors probably do use, so maybe there's an orthogonality issue here? There are probably some weird corner cases where being explicit could be useful as well. The WG hasn't actively considered this request before but we'll take a look at it and let you know what happens. Thanks! |
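For readers unfamiliar with the RelaxedPrecision route mentioned above, here is a minimal sketch in Vulkan GLSL. It assumes a combined image sampler at set 0, binding 0, and that the compiler (glslang, as far as I understand its Vulkan mode) turns `mediump` qualifiers into SPIR-V `RelaxedPrecision` decorations. The names `colorTex`, `uv`, and `outColor` are illustrative and not taken from the thread.

```glsl
#version 450

// Assumption: a combined image sampler bound at set 0, binding 0.
// The mediump qualifier is expected to become a RelaxedPrecision decoration.
layout(set = 0, binding = 0) uniform mediump sampler2D colorTex;

layout(location = 0) in vec2 uv;
layout(location = 0) out mediump vec4 outColor;

void main()
{
    // The sampled value is still typed as a 32-bit vec4 in SPIR-V;
    // RelaxedPrecision only permits the driver to use fp16 if it wants to.
    mediump vec4 c = texture(colorTex, uv);
    outColor = c * 0.5;
}
```

Whether a given driver actually samples or computes in fp16 here is an implementation detail, which is the orthogonality question raised above.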
There's currently no way I've found to get transpiled texture2d from GLSL without this extension implemented. MSL supports f16sampler2d and ftexture2d, but if it can't be expressed, then repeated casts are needed, and it's unclear whether the optimizer simplifies this casting logic. I don't see this extension listed on gpuinfo.org for any platform, not even for AMD, which defined it. The point is not for the sampler to do work in f16, but for the type and end result to be correct. Here's what I get now in the MSL. mediump doesn't work in transpiled code, with all expressions using full float. I'm switching to float16_t, but that has all sorts of missing storage extensions for inputOutput16 interpolation on Adreno and Nvidia.
|
I'm here because I was appalled to see there is no Metal has it. It's expressed with
Because developer sanity is an issue. I'm porting 32-bit code to work in 16-bit and I prefer to do the following:
uniform f16texture2D myTexture;
half4 a = texture( myTexture, uv0 );
half4 b = texture( myTexture, uv1 );
half3 c = texture( myTexture, uv2 ).xyz;
half3 d = texture( myTexture, uv3 ).xyz;
x = a + b;
y = c + d;
Over the following:
uniform texture2D myTexture;
half4 a = half4( texture( myTexture, uv0 ) );
half4 b = half4( texture( myTexture, uv1 ) );
half3 c = half3( texture( myTexture, uv2 ).xyz );
half3 d = half3( texture( myTexture, uv3 ).xyz );
x = a + b;
y = c + d;
Readability suffers. Porting efforts are larger. It's also error prone e.g. typing The first bit of code works if I compile with
I disagree. We want to support GPUs with and without 16-bit support. And we're extremely likely to use texelFetch on both paths (i.e. share the same line of code). If f16texelFetch appears, it's just going to be an unnecessary PITA,
|
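One way to share a single call site across the fp16 and fp32 paths with today's GLSL is to hide the cast behind a macro and a small helper. The sketch below is only an illustration: `USE_FP16` and `textureH` are hypothetical names, and it assumes `GL_EXT_shader_explicit_arithmetic_types_float16` is available on the fp16 path and that the texture is bound as a combined `sampler2D`.

```glsl
#version 450

// USE_FP16 is a hypothetical build-time macro, not part of any extension.
#ifdef USE_FP16
#extension GL_EXT_shader_explicit_arithmetic_types_float16 : require
#define half4 f16vec4
#else
#define half4 vec4
#endif

// Assumption: combined image sampler at set 0, binding 0.
layout(set = 0, binding = 0) uniform sampler2D myTexture;

layout(location = 0) in vec2 uv0;
layout(location = 1) in vec2 uv1;
layout(location = 0) out vec4 outColor;

// Hypothetical helper: performs the narrowing cast in one place so that
// call sites look the same whether half4 is f16vec4 or vec4.
half4 textureH(sampler2D tex, vec2 uv)
{
    return half4(texture(tex, uv));
}

void main()
{
    half4 a = textureH(myTexture, uv0);
    half4 b = textureH(myTexture, uv1);
    outColor = vec4(a + b); // widen explicitly for the 32-bit output
}
```

This keeps the casts out of the main code, but it is still a workaround: the fetch itself returns a 32-bit vector, and whether the conversion is folded away is up to the driver.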
In terms of current behavior we (Arm) will use fp16 either if the sampler is tagged as
... or ...
|
@darksylinc if GLSL is the issue then we could just get GLSLang to generate the right code under the covers? Vendors all do the half float optimisation where appropriate already, and there's nothing stopping us just making the high level language simpler here. If we do this as a Vulkan/SPIR-V change then it'd rely on driver updates which aren't always possible (e.g. on some mobile devices). If the high level code is the problem, perhaps we should fix the high level language. If that's sufficient for you, then that's a much more solvable problem? |
Yes. Each problem in its own domain: if SPIR-V wants to stay its current way for technical reasons (e.g. easier to support, maps better to hardware capabilities), then it should stay that way. But as for GLSL, the current status is insanity in terms of readability and maintenance, and it could use quality-of-life improvements. |
Again, relaxed precision designators have no bearing on transpiled codegen. This needs to be formalized as part of the GLSL syntax, which it already is with the AMD extension, or SPIRV-Cross needs to generate the correct use of texture. Also, casting to mediump isn't possible with macros to half, since it's not a type. |
Ditto on this. The following GLSL code won't work:
#define half2 mediump vec2
half2 a = half2( 0, 0 ); // error (undesired)
half2 a = vec2( 0, 0 ); // ok (undesired)
But it does with explicit float16_t:
#define half2 f16vec2
half2 a = half2( 0, 0 ); // ok
half2 a = vec2( 0, 0 ); // cast error (desired)
Ah yeah, although that's outside of what I need, he makes a good point: the information is lost in SPIRV, thus it cannot be properly translated when transpiling to Metal. Valve talks about this in https://www.lunarg.com/wp-content/uploads/2019/09/Automatic-RelaxedPrecision-Decoration-and-Conversion-in-Spirv-Opt_r1.pdf |
Support seems to be lacking for Float16 (and for the other extra types, really) samplers and storage images. Storage images feel like they should be supported without reservations.
The VK_KHR_16bit_storage extension does not support this. GL_AMD_gpu_shader_half_float_fetch can do this, but there seems to be no Vulkan extension that accepts SPV_AMD_gpu_shader_half_float_fetch shaders.
Migrated from https://community.khronos.org/t/why-vulkan-spir-v-opimageread-opimagefetch-opimagewrite-always-use-v4float/105009
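For context, this is roughly what the storage image case looks like without a half-float fetch extension: the image is loaded and stored as a 32-bit vector and narrowed or widened by hand. The sketch assumes `GL_EXT_shader_explicit_arithmetic_types_float16` for the `f16vec4` type and an `rgba16f` image at set 0, binding 0; the name `img` and the workgroup size are illustrative, and the bounds check is omitted for brevity.

```glsl
#version 450
#extension GL_EXT_shader_explicit_arithmetic_types_float16 : require

layout(local_size_x = 8, local_size_y = 8) in;

// Assumption: a 16-bit float storage image at set 0, binding 0.
layout(set = 0, binding = 0, rgba16f) uniform image2D img;

void main()
{
    ivec2 p = ivec2(gl_GlobalInvocationID.xy);

    // imageLoad returns a 32-bit vec4 even though the texels are fp16,
    // so the value is narrowed explicitly after the load.
    f16vec4 v = f16vec4(imageLoad(img, p));

    v *= float16_t(0.5);

    // imageStore takes a 32-bit vec4, so the value is widened again.
    imageStore(img, p, vec4(v));
}
```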