Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Option for software skinning in MeshInstance #40313

Merged
merged 1 commit into from
Oct 8, 2020

Conversation

pouleyKetchoupp
Copy link
Contributor

@pouleyKetchoupp pouleyKetchoupp commented Jul 12, 2020

Option in MeshInstance to enable software skinning, in order to test against the current USE_SKELETON_SOFTWARE path which causes problems with bad performance.

Co-authored by @lawnjelly

More details in #37696.

Remaining things to do around this feature:

@volzhs
Copy link
Contributor

volzhs commented Jul 12, 2020

skeleton test.zip
here is a sample project for playing same animation at the same time with duplicated instances.

skeleton_without_software_skinning
This is expected behavior (without software skinning)

skeleton_with_software_skinning
This is what happens with software skinning

@clayjohn
Copy link
Member

@volzhs does that issue still occur if you select the mesh and click "make unique"?

It looks like the skinning takes place in the MeshInstance, but it alters the mesh itself, so if multiple MeshInstances share a mesh, they will also share an animation.

@volzhs
Copy link
Contributor

volzhs commented Jul 12, 2020

@clayjohn "Make Sub-Resources Unique" on mesh does not make a difference.

@pouleyKetchoupp pouleyKetchoupp force-pushed the software-skinning-test-3.2 branch from cc11895 to ee57de3 Compare July 13, 2020 12:41
@pouleyKetchoupp
Copy link
Contributor Author

@volzhs I've just pushed an update that should fix your issue with instances playing the same animation. Let me know if everything still works fine for you when you have a chance to test it.

@volzhs
Copy link
Contributor

volzhs commented Jul 13, 2020

@pouleyKetchoupp yes! it definitely fixed my issue. 👍

@volzhs
Copy link
Contributor

volzhs commented Jul 13, 2020

oh... unfortunately, I found another issue.
meshes of skeleton became invisible when changing mesh of MeshInstance

mesh_instance.mesh = load(another_mesh_path)

and got error when trying to update its material

mesh_instance.material_override = material
E 0:00:03.936   instance_set_surface_material: Index p_surface = 0 is out of bounds (instance->materials.size() = 0).
  <C++ Source>  servers/visual/visual_server_scene.cpp:642 @ instance_set_surface_material()

I found workaround.

mesh_instance.mesh = load(another_mesh_path)
mesh_instance.software_skinning = false
mesh_instance.software_skinning = true
mesh_instance.material_override = material

now all meshes of skeleton are visible again and no error with updating material.

@pouleyKetchoupp
Copy link
Contributor Author

Ok, the issues with changing meshes and materials shouldn't be too hard to fix, I just haven't spent enough time to check different cases yet.

I'm updating the PR description to add the remaining things to do.

@pouleyKetchoupp pouleyKetchoupp force-pushed the software-skinning-test-3.2 branch from ee57de3 to e2d196b Compare July 13, 2020 18:50
@pouleyKetchoupp
Copy link
Contributor Author

@lawnjelly I've just added you as co-author since you've been helping with the code itself.

@lawnjelly
Copy link
Member

@lawnjelly I've just added you as co-author since you've been helping with the code itself.

Oh that's ok I didn't do much. Might be able to help in some of the optimization though.

Some other things for the todo list:

  • I think we need a project setting that will switch between the fallback using software skinning or the USE_SKELETON_SOFTWARE path (if the fast hardware path is available, it probably makes sense to still use that. We could possibly consider an override later on, as software skinning does seem to be faster with several lights, but that would need lots of testing). I can help have a look at this if you are not familiar.
  • With the normals I think I mentioned we could do with a flag to determine whether to transform them, probably defaulting to on. I had a look for an existing flag structure and there was 'kind of' one, using the ability to set a number of bits for a bool. It may make sense to revisit whether this is a good idea and whether it should be replaced by conventional bitflags.
  • Tangents for normal mapping may or may not be a possible issue, I haven't looked how godot handles these, perhaps @clayjohn might know.

@pouleyKetchoupp
Copy link
Contributor Author

@lawnjelly Thanks! It's all added to the todo list.

For switching the fallback between USE_SKELETON_SOFTWARE and the new software skinning, I would love some help :) I'm not so sure where to handle it, and whether it means moving the new software skinning code somewhere deeper on the render side.

@lawnjelly
Copy link
Member

lawnjelly commented Jul 14, 2020

@lawnjelly Thanks! It's all added to the todo list.

For switching the fallback between USE_SKELETON_SOFTWARE and the new software skinning, I would love some help :) I'm not so sure where to handle it, and whether it means moving the new software skinning code somewhere deeper on the render side.

I think we could simply add a 'can_hardware_skin' boolean function to the renderers, to be called after they initialize. The renderer detects the hardware caps (this is already done to select the USE_SKELETON_SOFTWARE path). If the function returns false, we automatically select software skinning when the mesh instance is setup.

I'll try and add this to my branch / make a PR to yours.

@lawnjelly
Copy link
Member

lawnjelly commented Jul 14, 2020

Ok it turned out it was very simple to get querying the renderer working, so simple I'll just write it here:
In bool RasterizerStorageGLES2::has_os_feature(const String &p_feature) (rasterizer_storage_gles2.cpp), add the line:

// could be any sensible string, doesn't have to be this
if (p_feature == "skinning_fallback")
	return config.use_skeleton_software;

use_skeleton_software is set when the hardware doesn't have the capabilities to use the main skinning method. This at present will use the SKELETON_SOFTWARE path, but we can override this if we choose to use software skinning instead.

Then in mesh_instance.cpp add this line where required (could be done in _initialize_skinning, or you could use a static once off variable for all mesh instances):

// this could alternatively be done as a once off, after renderer creation
bool request_software_skinning = VSG::storage->has_os_feature("skinning_fallback");

You will also need to add this to the mesh_instance.cpp headers:

#include "servers/visual/visual_server_globals.h"

This is assuming that visual server globals is okay to use from here, and isn't better called indirectly (I'll check that with reduz when we get around to it, as there is an indirection layer in the visual server for threading but it shouldn't be much more involved). The arvr_interface_gdnative seems to also use it in this way, but that's no guarantee it is best.

@lawnjelly
Copy link
Member

lawnjelly commented Jul 14, 2020

For project settings, add to visual_server.cpp, line 2435:

	GLOBAL_DEF("rendering/quality/skinning/software_skinning_fallback", false);
	GLOBAL_DEF("rendering/quality/skinning/force_software_skinning", false);

These can be accessed in mesh_instance.cpp with

	bool  software_fallback = GLOBAL_GET("rendering/quality/skinning/software_skinning_fallback");
	bool force_software = GLOBAL_GET("rendering/quality/skinning/force_software_skinning");

Also add this header to mesh_instance.cpp

#include "core/project_settings.h"

All these strings are likely to be slow, so that increases the case for deciding all this as a once off at startup. Not quite sure where the best place for this might be (possibly outside mesh_instance itself), as it is just some setup and storing a fast access bool.. @clayjohn any ideas?

@lawnjelly
Copy link
Member

lawnjelly commented Jul 14, 2020

BTW the latest change you made to the skinning loop resulted in a decrease in FPS from 580fps to 440fps, it is probably interfering with the ability to auto-vectorize, so I've swapped it back to the original in my branch:

			int b0 = bone_ids[0];
			int b1 = bone_ids[1];
			int b2 = bone_ids[2];
			int b3 = bone_ids[3];

			Transform transform;
			transform.origin =
					bone_weights[0] * bone_transforms[b0].origin +
					bone_weights[1] * bone_transforms[b1].origin +
					bone_weights[2] * bone_transforms[b2].origin +
					bone_weights[3] * bone_transforms[b3].origin;

			transform.basis =
					bone_transforms[b0].basis * bone_weights[0] +
					bone_transforms[b1].basis * bone_weights[1] +
					bone_transforms[b2].basis * bone_weights[2] +
					bone_transforms[b3].basis * bone_weights[3];

I'll have a look at a couple of optimization approaches.

@lawnjelly
Copy link
Member

lawnjelly commented Jul 14, 2020

Did a first look at some optimization. Pre-creating the weights in a linear buffer (instead of reading them from the write buffer) gave a modest increase (560 - 580fps), but I'm not sure it would be worth it with the extra memory use. Removing the branches for the compressed weights and compressed indices didn't yield much.

Had a little look at normals, I don't think it will slow things down that much (580fps -> 504fps), however what is less than ideal is that the default format stores normals compressed (8 bit per channel it looks like). It is possible to decompress them, transform then recompress, but it might be better to force the vertex format to use uncompressed normals when using software skinning, which would be more accurate and might be a bit faster.

If we were going to support the compressed normal format, SIMD intrinsics might be worth exploring because it has saturated math. In contrast I suspect the main transforms will already be auto-vectorized with SSE2, at least on x86_64, although we might benefit from CPU detection and intrinsics on ARMv7 where neon is optional, on ARMv8 it is mandated so should auto-vectorize. And it is highly likely that x86 machines will support the hardware path anyway.

Depending on the model it is possible that looping over the vertex by bone could conceivably be quicker, although it would need testing, I've seen this approach used in the past. It looked unlikely to be quicker if vertices were averaging 4 bones, however in models with a lot of single / double weighted verts it could be interesting to try, though it could potentially be less cache friendly in large models.

@pouleyKetchoupp
Copy link
Contributor Author

@lawnjelly Thanks for all the detailed feedback! I'll try to find some time for an update in the coming days.

Concerning normal compression, it could be as simple as what we're doing for vertices (format &= ~Mesh::ARRAY_COMPRESS_VERTEX in _initialize_skinning). Since a new dynamic mesh is created in the case of software skinning, it's easy enough to remove the compression just for this case.

@pouleyKetchoupp pouleyKetchoupp force-pushed the software-skinning-test-3.2 branch from e2d196b to ab08c46 Compare July 18, 2020 15:27
@pouleyKetchoupp
Copy link
Contributor Author

I've just pushed a new version with:

  • Automatic fallback (enabled by default in GLES2 to replace USE_SKELETON_SOFTWARE path)
  • Options to disable fallback or force software in project settings
  • Now the MeshInstance property is force_software_skinning to override project settings
  • Support for transforming normals (enabled by default) & option to disable it in project settings
  • Bone transform allocation on stack using alloca

@lawnjelly @clayjohn Are there more features needed like tangent transformation, or some extra optimization you would like to try?

@volzhs I couldn't reproduce your issue with setting the mesh and material after my last changes. Could you please try with this version and provide me with a project if you can still reproduce it?

@lawnjelly
Copy link
Member

lawnjelly commented Jul 18, 2020

Normals

I just had a quick look and a couple of points about transforming the normal... It is potentially a little different to transforming the vertex positions.

First you may only need to transform by the rotation (i.e. the 3x3 basis). This is worth testing, I don't really know the shader in detail, it's kind of hard to follow, it seems to be transforming by the whole matrix too...

As an example, if you have a normal pointing to the left (-1, 0, 0) then apply a translate with no rotation (4, 3, 0), the normal is now (3, 3, 0) i.e. pointing top right. Which doesn't sound right.

Second we should technically be using the transpose of the inverse of the matrix I believe:
https://paroj.github.io/gltut/Illumination/Tut09%20Normal%20Transformation.html

It so happens that this is the same as the rotation matrix in cases of uniform scale (i.e. the inverse is the transpose), but it may not hold for non uniform scale. From a look at the shader it looks like this is being assumed, so I guess this is okay (but interestingly means that normals in the engine are probably currently wrong with non uniform scale!).

Settings

I was wondering whether having a per mesh setting to decide whether to transform normals might be useful, with a default to transform them. I suppose we could have a project setting with 3 enum values - force transform all normals, use per mesh settings, or force no transform for all.

Setting whether to transform using software skinning per mesh I wasn't so sure about. I think in practice it will usually be an all or nothing thing, not sure on others views about that.

@pouleyKetchoupp
Copy link
Contributor Author

Normals:
It's probably fine as long as the limitation is the same as in the shader, so I'll keep it this way for now if it's ok with you.

Settings:
I agree forcing software on specific meshes might not be useful when thinking about it.

Maybe for now we could have only global settings, including for transforming normals? I would rather start simple as long as we don't have specific use cases that require finer settings.

I can make these project settings enum type with Enabled/Disabled so we can add a third entry "Per-mesh settings" if needed in some scenarios in the future without breaking compatibility. Do you think it would make sense?

@lawnjelly
Copy link
Member

Normals:
It's probably fine as long as the limitation is the same as in the shader, so I'll keep it this way for now if it's ok with you.

When I get a spare moment I'll try and check this out properly and see if there are any problems with the normals on moving the meshes. It could be that it's fine, but would be nice to confirm this.

Settings:
I agree forcing software on specific meshes might not be useful when thinking about it.

Maybe for now we could have only global settings, including for transforming normals? I would rather start simple as long as we don't have specific use cases that require finer settings.

I can make these project settings enum type with Enabled/Disabled so we can add a third entry "Per-mesh settings" if needed in some scenarios in the future without breaking compatibility. Do you think it would make sense?

Yup sure. Although sometimes when things don't get added on the first go they get forgotten (or it becomes more difficult to get the political will to change) on a subsequent PR. On the other hand having skinning options appear on all mesh instances (that might not be skinned) could be confusing.

@clayjohn
Copy link
Member

@lawnjelly @pouleyKetchoupp We have 2 paths for transforming normals. By default we do a simple transform that breaks with non-uniform scaling. But we also have the option to use a proper transform if users need non-uniform scaling and are okay with the performance cost

#if defined(ENSURE_CORRECT_NORMALS)
mat3 normal_matrix = mat3(transpose(inverse(modelview)));
normal = normal_matrix * normal;
#else
normal = normalize((modelview * vec4(normal, 0.0)).xyz);
#endif
#if defined(ENABLE_TANGENT_INTERP) || defined(ENABLE_NORMALMAP) || defined(LIGHT_USE_ANISOTROPY)
tangent = normalize((modelview * vec4(tangent, 0.0)).xyz);
binormal = normalize((modelview * vec4(binormal, 0.0)).xyz);
#endif
#endif

Settings

IMO forcing software skeleton should be a project setting, but ensuring correct normals should be a per-mesh setting. This way it lines up with the way we have ensure_correct_normals in both SpatialMaterials, and Shaders (as a render_mode).

@pouleyKetchoupp pouleyKetchoupp force-pushed the software-skinning-test-3.2 branch from 47c58a1 to 835182e Compare October 8, 2020 10:54
@pouleyKetchoupp
Copy link
Contributor Author

It's updated. For the crash, I've just added a check in _initialize_skinning for the mesh validity.

@lawnjelly
Copy link
Member

Looks good to me apart from those minor tweaks I mentioned. I would double check is_visible_in_tree functionality works ok when not in the scene tree and doesn't cause a crash.

Can't check it out at runtime as I'm away from home but am satisfied from previous versions that it should run good. 👍

Option in MeshInstance to enable software skinning, in order to test
against the current USE_SKELETON_SOFTWARE path which causes problems
with bad performance.

Co-authored-by: lawnjelly <lawnjelly@gmail.com>
@pouleyKetchoupp pouleyKetchoupp force-pushed the software-skinning-test-3.2 branch from 835182e to f954471 Compare October 8, 2020 14:15
@pouleyKetchoupp
Copy link
Contributor Author

@lawnjelly I've made the changes you requested and tested visibility on and off on parent nodes and on mesh instances, everything seems fine.

Copy link
Member

@lawnjelly lawnjelly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine to me, unless clayjohn / akien can see anything else that needs changing.

Copy link
Member

@clayjohn clayjohn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy with it! Let's get it merged!

@akien-mga akien-mga merged commit 0cea273 into godotengine:3.2 Oct 8, 2020
@akien-mga
Copy link
Member

Thanks a ton to all involved! :)

@lawnjelly
Copy link
Member

@pouleyKetchoupp : I've tracked down the regression causing the materials not to be working, more by luck than judgement, but if you look in _resolve_skeleton_path(), this is called whether or not the mesh instance is using a skeleton.

The old code has a if (skin_ref.is_valid()) call before it does anything, but the new code does not.

If you place a wrapper here:

	if (skin_ref.is_valid())
	{
		software_skinning_flags &= ~SoftwareSkinning::FLAG_BONES_READY;

		_initialize_skinning();
	}

The regression is gone. Essentially I think it is treating non skinned meshes as though they were.

I hesitate to make a PR though, because really I'm unfamiliar with the plumbing of this PR, only the skinning function, so I'll leave it up to you to fix.

pouleyKetchoupp added a commit to nekomatata/godot that referenced this pull request Oct 16, 2020
This fixes a regression from PR godotengine#40313 (support for software skinning in MeshInstance).

Before, the base mesh was always updated on load even if not skinning
was used, which caused mesh instance materials to be reset on the
rendering side.

Now the base mesh is set only when it has been modified, or when
switching software skinning on or off. In this case the mesh instance
materials are always updated properly afterwards.
@lawnjelly
Copy link
Member

Just a note of something that just occurred to me last night because it is so 'obvious' we may have missed it .. does the skeleton update take account of culling? If not we should fix this. (If not then this is an existing problem with skeleton update, but becomes more pressing with software skinning.)

@pouleyKetchoupp
Copy link
Contributor Author

@lawnjelly Good call! It doesn't look like Skeleton does anything like that, so it would be a great next step for 3d optimization in general.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants