Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix glsl shader for Android Mali-GXXx GPUs and Vulkan API 1.3.xxx #92817

Merged
merged 1 commit into from
Jul 22, 2024

Conversation

Alex2782
Copy link
Contributor

@Alex2782 Alex2782 commented Jun 5, 2024

Fixes #86621

Issue #73384 maybe only partially, not 3D on PowerVR Rogue GPUs comment


Test on Firebase

Test Project: GlowTest.zip

  • Glow on/off
  • DOF Blur on/off
  • rendering/scaling_3d/scale = 0.5
web-build_20240606_x988_panther-33-en_US-portrait_video.mp4
  • Glow
  • DOF Blur
  • Rendering Scale 3D

@Alex2782 Alex2782 force-pushed the fix_glsl_Mali-G branch 2 times, most recently from 1658a3d to bf06a1d Compare June 5, 2024 21:27
@Calinou Calinou added this to the 4.3 milestone Jun 5, 2024
@Calinou Calinou added the cherrypick:4.2 Considered for cherry-picking into a future 4.2.x release label Jun 5, 2024
@Alex2782 Alex2782 force-pushed the fix_glsl_Mali-G branch 3 times, most recently from 767facd to 7fea028 Compare June 6, 2024 15:14
@Alex2782 Alex2782 marked this pull request as ready for review June 6, 2024 15:58
@Alex2782 Alex2782 requested a review from a team as a code owner June 6, 2024 15:58
@clayjohn
Copy link
Member

clayjohn commented Jun 14, 2024

I have taken a close look at this and have reached out to ARM engineers for assistance. This is clearly an ARM driver bug that only impacts newer Mali devices.

To fix #86621, we only have to make the change in tonemap.glsl (and the other files that have exactly the same vertex shader), we don't need to remove all array accesses.

It seems the problem actually comes from the clamp() function call that clamps the UV coordinates. The UV coordinates should be (0,0), (0,2), (2,0). But on the broken devices the coordinates are (-2,-2), (-2,2), (2,-2). This is consistent with the clamp failing to clamp to (0,0). In fact, if you change the line to uv_interp = clamp(gl_Position.xy, vec2(-1.0, -1.0), vec2(1.0, 1.0)) * 2.0; // saturate(x) * 2.0, then desktop behaves the same as the problematic Mali devices.

ARM is taking a look as it seems this is reproducible in newer driver versions (I confirmed in driver version 46.0.0). They may be able to help us identify a more elegant workaround.

I think we should wait to hear back from them as the current workaround will have a performance impact as it emits many more instructions including branches. Ideally, we should be able to workaround the bug with something much less intrusive.

@Alex2782
Copy link
Contributor Author

This is a strange UV bug, without if / else if I couldn't find another solution.

some GLSL tests
void main() {
	vec2 base_arr[3] = vec2[](vec2(-1.0, -1.0), vec2(-1.0, 3.0), vec2(3.0, -1.0));
	//gl_Position = vec4(base_arr[gl_VertexIndex], 0.0, 1.0);
	
	//BUG
	//vec2 base = gl_Position.xy;
	//uv_interp = clamp(base, vec2(0.0, 0.0), vec2(1.0, 1.0)) * 2.0; // saturate(x) * 2.0
	
	//BUG
	//uv_interp.x = clamp(gl_Position.x, 0.0, 1.0) * 2.0;
	//uv_interp.y = clamp(gl_Position.y, 0.0, 1.0) * 2.0;

	//BUG
	// GLSL Docs: clamp returns the value of x constrained to the range minVal to maxVal. 
	// The returned value is computed as min(max(x, minVal), maxVal).
	//uv_interp = min(max(gl_Position.xy, vec2(0.0, 0.0)), vec2(1.0, 1.0)) * 2.0;

	
	int vertex_index = gl_VertexIndex;
	vec2 vertex_base = base_arr[vertex_index];


	//OK
	/*
	if (vertex_index == 0) {
		vertex_base = base_arr[0];
	} else if (vertex_index == 1) {
		vertex_base = base_arr[1];
	} else {
		vertex_base = base_arr[2];
	}
	*/

	//BUG
	/*
	if (vertex_index == 0) {
		vertex_base = base_arr[gl_VertexIndex];
	} else if (vertex_index == 1) {
		vertex_base = base_arr[gl_VertexIndex];
	} else {
		vertex_base = base_arr[gl_VertexIndex];
	}
	*/	

	//OK
	/*
	if (vertex_index == 0) {
		vertex_base = base_arr[vertex_index];
	} else if (vertex_index == 1) {
		vertex_base = base_arr[vertex_index];
	} else {
		vertex_base = base_arr[vertex_index];
	}
	*/
	
	//BUG
	/*
	for (int i=0; i < 100; i++) {
		vertex_base = base_arr[vertex_index];
	}
	*/

	//BUG
	for (int i=0; i < 3; i++) {
		if (vertex_index == 0 || vertex_index == 1 || vertex_index != 0) {
			vertex_base = base_arr[vertex_index];
		}
	}
	
	gl_Position = vec4(vertex_base, 0.0, 1.0);
	uv_interp = clamp(vertex_base, vec2(0.0, 0.0), vec2(1.0, 1.0)) * 2.0; // saturate(x) * 2.0

}

Fragment: UV debug
	if (uv_interp.x < -0.5 && uv_interp.y < -0.5) {
		color += vec4(0.2, 0.0, 0.0, 1.0);
	}
	else if (uv_interp.x < -0.2 && uv_interp.y < -0.2) {
		color += vec4(0.0, 0.2, 0.0, 1.0);
	}
	else if (uv_interp.x < 0.0 && uv_interp.y < 0.0) {
		color += vec4(0.0, 0.0, 0.2, 1.0);
	}
	else if (uv_interp.x > 0.5 && uv_interp.y > 0.5) {
		color += vec4(0.0, 0.2, 0.2, 1.0);
	}
	else if (uv_interp.x > 0.2 && uv_interp.y > 0.2) {
		color += vec4(0.2, 0.0, 0.2, 1.0);
	}	
	else if (uv_interp.x > 0.0 && uv_interp.y > 0.0) {
		color += vec4(0.2, 0.2, 0.0, 1.0);
	}

	frag_color = color;
}
UV-BUG (negative values) UV-OK (positive values)
image image

@clayjohn
Copy link
Member

clayjohn commented Jun 19, 2024

An update on this. ARM has confirmed that the internal shader compiler is replacing the clamp() function with a single min() operation. For some reason the compiler doesn't make this mistake when using the branches. However, we should avoid using branches for performance reasons.

I have asked if they know of a better workaround

@Alex2782
Copy link
Contributor Author

the internal shader compiler is replacing the clamp() function with a single min() operation.

A few days ago I tried it without the clamp function, same issue.

	//BUG
	// GLSL Docs: clamp returns the value of x constrained to the range minVal to maxVal. 
	// The returned value is computed as min(max(x, minVal), maxVal).
	uv_interp = min(max(gl_Position.xy, vec2(0.0, 0.0)), vec2(1.0, 1.0)) * 2.0;

@clayjohn
Copy link
Member

A few days ago I tried it without the clamp function, same issue.

I tried the same. ARM confirmed it results in the same code as clamp()

@Alex2782
Copy link
Contributor Author

Alex2782 commented Jul 17, 2024

Should we wait longer?

I think we should wait to hear back from them as the current workaround will have a performance impact as it emits many more instructions including branches. Ideally, we should be able to workaround the bug with something much less intrusive.

However, we should avoid using branches for performance reasons.

I could also solve without #include and only in GLSL files for Glow, DOF Blur and Rendering Scale 3D.

Bunnymark on Adreno 650 (3y old device), GLES3 commits canvas.glsl if / else if was more performant: #87352 (comment) as an array with 6 vec2 values.

Details
  • Samsung Tab S7 (Android 13)
  • Debug App
  • 2k resolution (no stretch)

I ran 2 tests for each:

  • benchmark output: 10960 and 10820 with if / else if
  • benchmark output: 10900 and 10840 with is_x_set / is_y_set
  • benchmark output: 10660 and 10760 with vertex_base_arr[6] and gl_VertexID % 6
  • benchmark output: 10860 and 10700 with vertex_base_arr[4] and gl_VertexID % 4

Screenshot_20240210_042749

I think that the performance improvements or deteriorations will not be measurable because Glow etc. is only executed once per frame.

@akien-mga
Copy link
Member

I tested this PR (rebased on latest master) on a Google Pixel 7a (Mali-G710) and I confirm this fixes the issue for me.

@clayjohn
Copy link
Member

clayjohn commented Jul 19, 2024

I could also solve without #include and only in GLSL files for Glow, DOF Blur and Rendering Scale 3D.

Let's do this for now. I suspect this will slightly harm performance of those shaders on modern devices. However, as you pointed out, each of those shaders run at most a few times per frame, so the overall performance impact would be minimal.

I would like to find a more elegant workaround in the future. But we should solve this for 4.3.

So please go ahead and apply your fix directly to Glow, DoF Blur, and Rendering Scale 3D and I think Tonemap.glsl needs it as well

Also, for context, we are still waiting to hear back from ARM. But we can't hold off the release while we wait, so it is best to go forward with a known workaround in the meantime

@Alex2782
Copy link
Contributor Author

  • tonemap.glsl for Glow and Rendering Scale 3D
  • and bokeh_dof_raster.glsl for DOF Blur

Test on Pixel 7a

Screenshot 2024-07-20 210428

Alex2782

This comment was marked as outdated.

Copy link
Member

@clayjohn clayjohn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Thank you.

Let's go with this workaround for 4.3 and then hopefully do something else once we have a different workaround

@akien-mga akien-mga merged commit 587f1d0 into godotengine:master Jul 22, 2024
18 checks passed
@akien-mga
Copy link
Member

Thanks!

@Alex2782 Alex2782 deleted the fix_glsl_Mali-G branch July 22, 2024 17:43
@clayjohn
Copy link
Member

For posterity, ARM has confirmed that this is a driver bug (now fixed in r51) and using branches as we did is the best workaround. The alternative is to use a uniform buffer to store the position array

@Alex2782 Alex2782 restored the fix_glsl_Mali-G branch August 1, 2024 00:12
@Alex2782 Alex2782 deleted the fix_glsl_Mali-G branch August 1, 2024 00:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Glow covering screen on Android with single color
4 participants