Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix CPU/GPUParticles2D bugs on Compatibility Rendering (GLES3) on Adreno 3XX devices. #88816

Merged
merged 1 commit into from
Apr 4, 2024

Conversation

joined72
Copy link
Contributor

@joined72 joined72 commented Feb 25, 2024

Fixes #88815
This PR is the natural completion of #87352

Tested Devices:

  • Sony Xperia M2 (Adreno 305) - FIXED
  • Huawei MediaPad T3 10 (Adreno 308) - FIXED
  • Samsung Galaxy S4 (Adreno 320) - FIXED
  • LG Nexus 5 (Adreno 330) - FIXED
  • Samsung Note 4 (Adreno 420) - NOT AFFECTED
  • Huawei Honor 7A (Adreno 505) - NOT AFFECTED
  • LG G5 (Adreno 530) - NOT AFFECTED
  • Xiomi Redmi 12 (Adreno 610) - NOT AFFECTED
  • Huawei Honor 9 Lite (Mali T830) - NOT AFFECTED

This PR allows a game that uses GPUParticles2D on older Adreno 3XX devices to start and run fine (without any freeze) but not allow to use GPUParticles2D (however, it allows you to fully use the CPUParticles2D, which otherwise would not work at all)...

Let me Explain...

After fixed all the incompatibilities of the Particle Shaders that caused the game to freeze at startup (only on Adreno 3XX devices), I saw that the particles were still NOT displayed even though the FPS drop showed that they were still processed.
The drop in FPS was truly notable, even just inserting an "empty" GPUParticles2D Node, there was a drop of 11/13 FPS.

At this point I completely disabled the GPUParticles2D functionality (only on Adreno 3XX devices) because trying to go further no longer made sense since these devices are too old and slow to be able to handle particles smoothly.

So the scope of this PR is ONLY let a game with GPUParticles2D start and run fine (without Freeze) on old Adreno 3XX devices (but without any particles).

Screenshot of a demo game on a LG Nexus 5 (Adreno 330) before the fixes...
image

After the fixes... (you cannot see the Particles because was disabled, but at least the game is playable)
image

@joined72 joined72 requested a review from a team as a code owner February 25, 2024 14:10
@joined72 joined72 force-pushed the gles3_gpuparticles2d_fixes branch 4 times, most recently from 55762de to 53a61b0 Compare February 25, 2024 15:07
@joined72
Copy link
Contributor Author

@clayjohn can you review it, please?

@joined72 joined72 changed the title Fix GPUParticles2D on Compatibility Rendering (GLES3) on Adreno 3XX devices. Fix GPUParticles2D Freeze on Compatibility Rendering (GLES3) on Adreno 3XX devices. Feb 25, 2024
@AThousandShips AThousandShips added this to the 4.3 milestone Feb 25, 2024
@joined72
Copy link
Contributor Author

@Calinou do you think this PR can be cherry-picked for the 4.2.2 release?

@clayjohn
Copy link
Member

@joined72 I expect that the particles process material would be more problematic than actually drawing the particles as drawing particles works the same way as drawing CPU particles or drawing MultiMeshes.

Have you tested using CPUParticles2D instead of GPUParticles2D? It would be helpful to know if CPUParticles2D causes the same crash

@joined72
Copy link
Contributor Author

joined72 commented Feb 27, 2024

@joined72 I expect that the particles process material would be more problematic than actually drawing the particles as drawing particles works the same way as drawing CPU particles or drawing MultiMeshes.

Have you tested using CPUParticles2D instead of GPUParticles2D? It would be helpful to know if CPUParticles2D causes the same crash

@clayjohn CPUParticles2D don't cause any crash, but still not be showed and don't decrease FPS, so I focused only on GPUParticles2D.

EDIT: if you think I can focus only on the scene/2d/gpu_particles_2d.cpp source file.

@joined72
Copy link
Contributor Author

@clayjohn

Let me Explain in details...

The freeze was caused by an incompatibility in the drivers/gles3/shaders/particles.glsl Shader.

Fixed that, I found that the drivers/gles3/shaders/particles_copy.glsl Shader not compiled for a syntax error.

Fixed that I thought to see the GPUParticles2D on the Adreno 3XX device, but NO, it didn't work!

While I was working on showing the particles, I understand that the device is too old and slow to handle particles smoothly, so I decided to disable GPUParticles2D on Adreno 3XX devices.

At the start, I done this working on the scene/2d/gpu_particles_2d.cpp source file (at the GPUParticles2D::set_emitting and GPUParticles2D::is_emitting methods) after I decided to move the "disabling" logic at the "rendering level" (working on the drivers/gles3/storage/particles_storage.cpp source file).

PS: from my test CPUParticles2D doesn't crash Adreno 3XX devices, but particles are still not showed, and FPS not decrease.

@clayjohn
Copy link
Member

@clayjohn

Let me Explain in details...

The freeze was caused by an incompatibility in the drivers/gles3/shaders/particles.glsl Shader.

Fixed that, I found that the drivers/gles3/shaders/particles_copy.glsl Shader not compiled for a syntax error.

Fixed that I thought to see the GPUParticles2D on the Adreno 3XX device, but NO, it didn't work!

While I was working on showing the particles, I understand that the device is too old and slow to handle particles smoothly, so I decided to disable GPUParticles2D on Adreno 3XX devices.

At the start, I done this working on the scene/2d/gpu_particles_2d.cpp source file (at the GPUParticles2D::set_emitting and GPUParticles2D::is_emitting methods) after I decided to move the "disabling" logic at the "rendering level" (working on the drivers/gles3/storage/particles_storage.cpp source file).

PS: from my test CPUParticles2D doesn't crash Adreno 3XX devices, but particles are still not showed, and FPS not decrease.

Thank you for clarifying! That makes sense to me now.

So it seems the problem is that MultiMeshes are not drawing correctly in the 2D renderer. I don't think that is a limitation of Adreno 3XX though. It is more likely a problem with our code since regular rect drawing uses a similar pathway to MultiMeshes. I'm assuming you are testing on master in which case, you could actually be seeing a recently introduced issue that was fixed yesterday by #88938. Could you test with CPUParticles again on the most recent master branch and see if they are visible now?

To be clear, I think we should investigate why particles are not appearing instead of disabling them on Adreno devices.

@joined72
Copy link
Contributor Author

Could you test with CPUParticles again on the most recent master branch and see if they are visible now?

Ok, I'll update ASAP.

To be clear, I think we should investigate why particles are not appearing instead of disabling them on Adreno devices.

Great 😃

@joined72
Copy link
Contributor Author

joined72 commented Feb 29, 2024

@clayjohn I'm sorry but the most recent master branch has the same issue, CPUParticles2D aren't showed and the FPS aren't dropped.

Based on my experience, the problem "should" be using some form of Shader that uses GLSL code incompatible with Adreno 3XX but compatible with all other devices.

I tried checking in the scene/2d/cpu_particles_2d.cpp sources, but I didn't find anything like that.

Could you give me some directions?

EDIT: is possible the Adreno 3XX Shader "incompatibility" is in the Shader generated by the scene/resources/particle_process_material.cpp sources?

void ParticleProcessMaterial::_update_shader() {

@clayjohn
Copy link
Member

@clayjohn I'm sorry but the most recent master branch has the same issue, CPUParticles2D aren't showed and the FPS aren't dropped.

Based on my experience, the problem "should" be using some form of Shader that uses GLSL code incompatible with Adreno 3XX but compatible with all other devices.

I tried checking in the scene/2d/cpu_particles_2d.cpp sources, but I didn't find anything like that.

Could you give me some directions?

EDIT: is possible the Adreno 3XX Shader "incompatibility" is in the Shader generated by the scene/resources/particle_process_material.cpp sources?

void ParticleProcessMaterial::_update_shader() {

Based on your description, the only shader that could be causing this issue is https://github.com/godotengine/godot/blob/master/drivers/gles3/shaders/canvas.glsl Particularly with the USE_ATTRIBUTES and USE_INSTANCING defines. To further confirm, can you also test a scene with a MeshInstance2D and a scene with a MultiMeshInstance2D?

If the MeshInstance2D does not appear, then the issue is with USE_ATTRIBUTES If the MeshInstance2D appears, but the MultiMeshInstance2D does not appear, then the issue is with USE_INSTANCING.

If the MultiMeshInstance2D does appear correctly. Then the issue is most likely with using both custom color and custom data. So try enabling those on the MultiMeshInstance2D. CPUParticles internally are just a MultiMeshInstance2D with both custom color and custom data enabled.

@clayjohn
Copy link
Member

On a side note, in the rendering meeting we discussed if there would be a performance impact to using an if/else if branch instead of a switch statement.

I took a look and confirmed that on AMD the generated bytecode is the same between the two approaches:

Switch: https://shader-playground.timjones.io/6145551406931ced2e87820428c9e21f
If/else: https://shader-playground.timjones.io/e7533bf63f1c292c13f2fffa8c0e76e2

@joined72
Copy link
Contributor Author

joined72 commented Mar 1, 2024

On a side note, in the rendering meeting we discussed if there would be a performance impact to using an if/else if branch instead of a switch statement.

I took a look and confirmed that on AMD the generated bytecode is the same between the two approaches:

Switch: https://shader-playground.timjones.io/6145551406931ced2e87820428c9e21f If/else: https://shader-playground.timjones.io/e7533bf63f1c292c13f2fffa8c0e76e2

Really Great, thanks for the update! ;)

PS: I'm working on the MeshInstance2D and MultiMeshInstance2D testing, and found that while MeshInstance2D works well, using a MultiMeshInstance2D node on an Adreno 3XX device crashes the app at the start.
I'm doing more investigations...

UPDATE:

Logcat...
I art     : Rejecting re-init on previously-failed class java.lang.Class<org.godotengine.godot.Godot$onInitRenderView$2>
I art     : Rejecting re-init on previously-failed class java.lang.Class<org.godotengine.godot.Godot$onInitRenderView$2>
D OpenGLRenderer: Use EGL_SWAP_BEHAVIOR_PRESERVED: true
D AndroidRuntime: Shutting down VM
E AndroidRuntime: FATAL EXCEPTION: main
E AndroidRuntime: Process: com.godot.multimesh, PID: 14078
E AndroidRuntime: java.lang.RuntimeException: Unable to start activity ComponentInfo{com.godot.multimesh/com.godot.game.GodotApp}: java.lang.IllegalStateException: Unable to initialize engine render view
E AndroidRuntime: 	at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:2416)
E AndroidRuntime: 	at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:2476)
E AndroidRuntime: 	at android.app.ActivityThread.-wrap11(ActivityThread.java)
E AndroidRuntime: 	at android.app.ActivityThread$H.handleMessage(ActivityThread.java:1344)
E AndroidRuntime: 	at android.os.Handler.dispatchMessage(Handler.java:102)
E AndroidRuntime: 	at android.os.Looper.loop(Looper.java:148)
E AndroidRuntime: 	at android.app.ActivityThread.main(ActivityThread.java:5417)
E AndroidRuntime: 	at java.lang.reflect.Method.invoke(Native Method)
E AndroidRuntime: 	at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:726)
E AndroidRuntime: 	at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:616)
E AndroidRuntime: Caused by: java.lang.IllegalStateException: Unable to initialize engine render view
E AndroidRuntime: 	at org.godotengine.godot.GodotFragment.performEngineInitialization(GodotFragment.java:196)
E AndroidRuntime: 	at org.godotengine.godot.GodotFragment.onCreate(GodotFragment.java:182)
E AndroidRuntime: 	at androidx.fragment.app.Fragment.performCreate(Fragment.java:2949)
E AndroidRuntime: 	at androidx.fragment.app.FragmentStateManager.create(FragmentStateManager.java:475)
E AndroidRuntime: 	at androidx.fragment.app.FragmentStateManager.moveToExpectedState(FragmentStateManager.java:278)
E AndroidRuntime: 	at androidx.fragment.app.FragmentManager.executeOpsTogether(FragmentManager.java:2189)
E AndroidRuntime: 	at androidx.fragment.app.FragmentManager.removeRedundantOperationsAndExecute(FragmentManager.java:2100)
E AndroidRuntime: 	at androidx.fragment.app.FragmentManager.execSingleAction(FragmentManager.java:1971)
E AndroidRuntime: 	at androidx.fragment.app.BackStackRecord.commitNowAllowingStateLoss(BackStackRecord.java:311)
E AndroidRuntime: 	at org.godotengine.godot.GodotActivity.onCreate(GodotActivity.kt:84)
E AndroidRuntime: 	at com.godot.game.GodotApp.onCreate(GodotApp.java:45)
E AndroidRuntime: 	at android.app.Activity.performCreate(Activity.java:6251)
E AndroidRuntime: 	at android.app.Instrumentation.callActivityOnCreate(Instrumentation.java:1107)
E AndroidRuntime: 	at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:2369)
E AndroidRuntime: 	... 9 more

The problem is "Unable to initialize engine render view" at GodotFragment.java:196

I'm assuming you are testing on master in which case, you could actually be seeing a recently introduced issue that was fixed yesterday by #88938.

I'm working on the 4.2.1-stable branch now, to avoid any other extra issues.

@joined72
Copy link
Contributor Author

joined72 commented Mar 1, 2024

The problem is "Unable to initialize engine render view" at GodotFragment.java:196

Problem solved, was caused from my wrong configuration!

About the MeshInstance2D and MultiMeshInstance2D issue, I found that the MultiMeshInstance2D works fine ONLY if is NOT used the custom color (custom data is OK). I'm working on it...

@joined72
Copy link
Contributor Author

joined72 commented Mar 1, 2024

I found other useful info... the problem is in this part of the drivers/gles3/shaders/canvas.glsl sources:

The following line:

vec4 instance_color = vec4(unpackHalf2x16(instance_color_custom_data.x), unpackHalf2x16(instance_color_custom_data.y));

returns ALWAYS instance_color.z and instance_color.w equal to zero!

I'm continuing my investigation...

@joined72
Copy link
Contributor Author

joined72 commented Mar 1, 2024

Other info...
If I change the following drivers/gles3/shaders/canvas.glsl source code:

FROM:

vec4 instance_color = vec4(unpackHalf2x16(instance_color_custom_data.x), unpackHalf2x16(instance_color_custom_data.y));
color *= instance_color;

TO:

vec4 instance_color = vec4(unpackHalf2x16(instance_color_custom_data.x), unpackHalf2x16(instance_color_custom_data.y));
instance_color.w = 1.0;
instance_color.z = 1.0;
color *= instance_color;

I force BLUE and ALPHA channels ALWAYS to 100% but I CAN SEE CPUParticles2D, but ONLY if I use a NOT empty Texture!

@joined72
Copy link
Contributor Author

joined72 commented Mar 1, 2024

GREAT NEWS! ;)

I found a solution. Changing the following drivers/gles3/shaders/canvas.glsl source code:

FROM:

if (bool(read_draw_data_flags & FLAGS_INSTANCING_HAS_COLORS)) {
	vec4 instance_color = vec4(unpackHalf2x16(instance_color_custom_data.x), unpackHalf2x16(instance_color_custom_data.y));
	color *= instance_color;
}
if (bool(read_draw_data_flags & FLAGS_INSTANCING_HAS_CUSTOM_DATA)) {
	instance_custom = vec4(unpackHalf2x16(instance_color_custom_data.z), unpackHalf2x16(instance_color_custom_data.w));
}

TO:

if (bool(read_draw_data_flags & FLAGS_INSTANCING_HAS_COLORS)) {
	vec4 instance_color;
	instance_color.xy = unpackHalf2x16(instance_color_custom_data.x);
	instance_color.zw = unpackHalf2x16(instance_color_custom_data.y);
	color *= instance_color;
}
if (bool(read_draw_data_flags & FLAGS_INSTANCING_HAS_CUSTOM_DATA)) {
	instance_custom.xy = unpackHalf2x16(instance_color_custom_data.z);
	instance_custom.zw = unpackHalf2x16(instance_color_custom_data.w);
}

FIX ANY CPUParticles2D bugs (works fine with and without Texture)!

@clayjohn what do you think about?

If you think these are acceptable changes then the following piece of code would also need to be modified, although I don't know exactly what it does.

if (gl_VertexID % 3 == 0) {
	vertex = read_draw_data_point_a;
	uv = read_draw_data_uv_a;
	color = vec4(unpackHalf2x16(read_draw_data_color_a_rg), unpackHalf2x16(read_draw_data_color_a_ba));
} else if (gl_VertexID % 3 == 1) {
	vertex = read_draw_data_point_b;
	uv = read_draw_data_uv_b;
	color = vec4(unpackHalf2x16(read_draw_data_color_b_rg), unpackHalf2x16(read_draw_data_color_b_ba));
} else {
	vertex = read_draw_data_point_c;
	uv = read_draw_data_uv_c;
	color = vec4(unpackHalf2x16(read_draw_data_color_c_rg), unpackHalf2x16(read_draw_data_color_c_ba));
}

NOTE: all these changes did not fix the GPUParticles2D issue (particles aren't showed at all).

@joined72
Copy link
Contributor Author

joined72 commented Mar 4, 2024

@clayjohn from my research, it seems that the problem is that the drivers/gles3/shaders/particles_copy.glsl shader, receives incorrect parameters (specifically velocity_flags), in fact by commenting on the following line, the particles appear even if they do not move.

// if (bool(floatBitsToUint(velocity_flags.w) & PARTICLE_FLAG_ACTIVE))

I tried to figure out who passed the parameters to particles_copy.glsl shader, but I couldn't.

I thought it was passed to it by the particles.glsl, but by changing its out_ parameters, particles_copy.glsl still gets the wrong ones.

@joined72
Copy link
Contributor Author

joined72 commented Mar 6, 2024

@clayjohn
I managed to activate the GPUParticles2D (with some limitations not yet resolved) by making the following changes:

particles.glsl

-	flags = floatBitsToUint(velocity_flags.w);
+	flags = PARTICLE_FLAG_ACTIVE; // On Adreno 3XX 'velocity_flags.w' is ALWAYS ZERO!
...
-	out_velocity_flags.w = uintBitsToFloat(flags);
+	out_velocity_flags.w = float(flags);

particles_copy.glsl

-	if (bool(floatBitsToUint(velocity_flags.w) & PARTICLE_FLAG_ACTIVE)) {
+	if (bool(uint(velocity_flags.w) & PARTICLE_FLAG_ACTIVE)) {`

What do you think about?

I have testes these changes on NOT Adreno 3XX devices and works fine (also forcing flags = PARTICLE_FLAG_ACTIVE;, if I disable the particles emitting flag, the particles continue to BE DISABLED).

@joined72
Copy link
Contributor Author

joined72 commented Mar 6, 2024

From the Chat...

I don't think we can set flags = PARTICLE_FLAG_ACTIVE every cycle without breaking something. There are many cases where the flag should be inactive, even though emitting is true. Like if you use particle lifetime randomness or explosiveness

@clayjohn
My problem is that I still can't figure out who initializes velocity_flags.w in the particles.glsl shader.

I understand that out_velocity_flags of particles.glsl is passed as input parameter to particles_copy.glsl, but I still don't understand who is initializing the velocity_flags input parameter of particles.glsl.

If I could understand this well, I think I could understand why, in Adreno 3XX devices, this parameter is always ZERO.

EDIT: I think is possible that the particles.glsl velocity_flags initialization, is done from the particles_storage.cpp sources, but I cannot find the exact line.

EDIT2: I think to have found the line here:

glEnableVertexAttribArray(1); // .xyz: velocity. .z: flags.
glVertexAttribPointer(1, 4, GL_FLOAT, GL_FALSE, stride, CAST_INT_TO_UCHAR_PTR(stride * lifetime_split + sizeof(float) * 4 * 1));

I'm working on it ;)

@joined72 joined72 force-pushed the gles3_gpuparticles2d_fixes branch 2 times, most recently from 98fe6c9 to 254438a Compare March 12, 2024 09:30
@joined72 joined72 changed the title Fix GPUParticles2D Freeze on Compatibility Rendering (GLES3) on Adreno 3XX devices. Fix CPU/GPUParticles2D bugs on Compatibility Rendering (GLES3) on Adreno 3XX devices. Mar 12, 2024
@joined72
Copy link
Contributor Author

joined72 commented Mar 12, 2024

A few days ago I received a Sony Xperia M2 (Adreno 305), which seems to be one of the oldest devices in the Adreno 3XX family.

I bought it to check if my fixes were sufficient even for such an old device and I discovered that they were NOT.

I therefore had to do more research to get MultiMeshInstance2D and CPUParticles2D working on Adreno 305 as well.

What I discovered is that such old Adreno drivers have the following extra bugs:

  • expose some GLES 310 functions on GLES 300 (more than less obsolete drivers already did)
  • these incorrectly exposed functions ARE BUGGED
  • if you call functions present in an included file, but inside an #ifdef ... #endif block, they give a compilation error like 'assign' : cannot convert from 'const float' to '2-component vector of float'

To fix all these bugs I had to rename all the functions of stdlib_inc.glsl shader (adding the godot_ prefix), and remove the #ifdef USE_GLES_OVER_GL ... #endif directive.

@clayjohn can you take a look and tell me what do you think, please?

@joined72
Copy link
Contributor Author

joined72 commented Mar 16, 2024

I'm running some benchmarks to test the new godot_ functions speed.

From my tests, results that calling the godot_ functions about 100.000 times/sec there isn't any performance impact.

Test results (100.000 calls/secs):

  • Huawei Honor 7A - performance lost 0.9%
  • Samsung Galaxy Note 4 - performance lost 0.25%
  • LG G4 - performance lost 0%
  • Xiaomi Redmi 8 - performance lost 0%
  • Xiaomi Redmi 10 - performance lost 0%
  • Xiaomi Redmi 12 - performance lost 0%
  • Huawei Honor 9 Lite - performance lost 0%

There is a really minimal performance loss only on very old devices but we have to consider that we are calling these functions 100.000 times/sec.

I think the driver implementation is very similar to the godot_ one! :)

@joined72 joined72 force-pushed the gles3_gpuparticles2d_fixes branch 5 times, most recently from 1f3ca0e to e7176fb Compare March 20, 2024 16:55
@joined72
Copy link
Contributor Author

joined72 commented Mar 20, 2024

@clayjohn

Currently this PR brings the following fixes:

  1. fixes the freeze caused by the presence of GPUParticles2D
  2. fix CPUParticles2D and MultiMeshes (now they work perfectly)

Considering that the number of bugs to fix GPUParticles2D on Adreno 3XX devices increases (I just identified a new bug/freeze in the scene/resources/particle_process_material.cpp code), and considering how slow the Adreno 3XX family devices are in processing GPUParticles2D (I can say this because even if the particles are not displayed, they are still processed and a notable decrease in FPS of at least 11/13 FPS for simple particles can be measured), and considering how many fixes this PR brings on all Adreno 3XX devices (even on very old devices, like Adreno 305) and how much time was required to debug only the part relating to GPUParticles2D, without finding any fixes, I decided, at least for the moment, to disable GPUParticles2D at all on any Adreno 3XX device.

Can I ask for the review required for the merge, please? ;)

drivers/gles3/storage/particles_storage.cpp Outdated Show resolved Hide resolved
drivers/gles3/storage/particles_storage.cpp Outdated Show resolved Hide resolved
drivers/gles3/shaders/stdlib_inc.glsl Show resolved Hide resolved
drivers/gles3/shaders/stdlib_inc.glsl Outdated Show resolved Hide resolved
@joined72 joined72 requested a review from clayjohn April 2, 2024 09:57
Copy link
Member

@clayjohn clayjohn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me now! Great investigative work

@akien-mga akien-mga merged commit b4b4919 into godotengine:master Apr 4, 2024
16 checks passed
@akien-mga
Copy link
Member

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

GPUParticles2D on Compatibility Rendering (GLES3) freeze on Adreno 3XX devices
5 participants