Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize transform propagation for hidden 3d objects #45583

Merged
merged 1 commit into from
Feb 6, 2021

Conversation

lawnjelly
Copy link
Member

@lawnjelly lawnjelly commented Jan 30, 2021

When visual instances are hidden, there is no need to update their global transforms and send these to the visual server. This only needs to be done when the objects are reshown.

Explanation

I'm currently looking at optimizing performance of hidden objects, as I've noticed they surprisingly can slow down frame rates significantly. While #45576 deals with the case of static objects, there is still a performance problem when hidden 3d objects are moved.

Benchmarks:

0 objects
6500 fps

10,000 objects, hidden and static (with #45576)
5500 fps

10,000 objects, hidden but moved each frame via moving parent node
158 fps

10,000 objects, hidden but moved each frame via moving parent node, with this PR
360 fps

Clearly the drop with dynamic objects is very significant, and should in theory be totally unnecessary.

Profiling revealed a lot of time was spent calling get_global_transform() and sending the transform to the visual server.

This example PR keeps a local flag for visibility state (something we should probably be doing anyway, as is_visible_in_tree can be expensive), doesn't update these transforms when the node is not visible.

This increases fps to 360 fps. While an improvement it is still not as good as can be. This is a profile with this PR applied. Still there is a lot of iterations of the propagation code (12,000000).

And actually the BVH seems to be taking some time which shouldn't be happening, will investigate - ah .. this is a false problem, it is because this PR is without #45576 applied which cures that. 😁

dynamic_visibility_profile

Ideally though we would like to be able to get to a hidden node, and disable propagation of transforms below that node in the scene tree. This is not (afaik) currently supported, presumably because even though child nodes of a hidden node may not need to be visible, they still may need the transform (e.g. physics objects).

I'm currently investigating whether it might be worth us having a single, or set of disable modes to prevent propagation of transforms, or perhaps other propagated features.

Any comments welcome. The flags BTW is because it is clear from the node source that there are a number of bools that are intended to be converted to flags but no one has got round to it, so I'm potentially getting the ball rolling on that idea. You could either have access functions for the flags or access the data directly to use bit operations, whichever is preferable. Maybe the latter actually having seen this pattern used in other parts of Godot. Or we could use a bool in this PR, and leave all the conversions to flags in another PR - whichever is preferred.

@Calinou Calinou added this to the 3.2 milestone Jan 30, 2021
@lawnjelly lawnjelly force-pushed the optimize_transform_propagate branch 7 times, most recently from b312488 to 9632162 Compare February 3, 2021 15:05
@lawnjelly lawnjelly changed the title [WIP] Optimize transform propagation for hidden 3d objects Optimize transform propagation for hidden 3d objects Feb 3, 2021
@lawnjelly lawnjelly requested a review from reduz February 3, 2021 15:07
@lawnjelly lawnjelly marked this pull request as ready for review February 3, 2021 15:07
@lawnjelly
Copy link
Member Author

lawnjelly commented Feb 3, 2021

Moved this from draft now as I think it should be a good idea, as it gives a free speed boost and is fairly non-intrusive. That is on the assumption that the visual server doesn't need transforms updated for hidden objects, need to get reduz to confirm this. If he thinks this is the case we can do something similar in master.

EDIT - reduz says that this is not required in Godot 4.0 as it works differently in this area. And confirms that the visual server shouldn't need transforms of hidden objects, so that sounds as though this is a sensible change.

@akien-mga akien-mga requested a review from a team February 3, 2021 15:12
When visual instances are hidden, there is no need to update their global transforms and send these to the visual server. This only needs to be done when the objects are reshown.
@lawnjelly lawnjelly force-pushed the optimize_transform_propagate branch from 9632162 to cb8134f Compare February 3, 2021 15:17
(data.parent && !data.toplevel_active) ?
data.parent->get_global_transform().affine_inverse() * p_transform :
p_transform;
Transform xform = (data.parent && !data.toplevel_active) ? data.parent->get_global_transform().affine_inverse() * p_transform : p_transform;

Copy link
Member Author

@lawnjelly lawnjelly Feb 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gave up here and put it all on one line. clang format seems to consistently produce a line breaking result which fails the CI checks for this. Clang format seems to give 6 tabs and 2 spaces, whereas the static checks want 5 tabs and 2 spaces? I thought it was my IDE but now not so sure.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might need to upgrade to clang-format 11 to get the same results as the CI. But yeah, clang-format seems to have troubles handling long ternary operators in a consistent way.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I'll try the clang-format upgrade, see if it helps! 👍

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-format-11 gives 8 tabs. Maybe something they haven't quite got sorted yet! 😁

@hpvb
Copy link
Member

hpvb commented Feb 3, 2021

As I said on rocket.chat: This seems like the correct approach for 3.2! 👍

@Calinou
Copy link
Member

Calinou commented Feb 3, 2021

Ideally though we would like to be able to get to a hidden node, and disable propagation of transforms below that node in the scene tree. This is not (afaik) currently supported, presumably because even though child nodes of a hidden node may not need to be visible, they still may need the transform (e.g. physics objects).

Node2D and Spatial have a toplevel property (top_level in master). If true, the node won't inherit from its parents' transforms. Can this property be taken into account for this optimization?

However, as you state, there's no "parent" counterpart that would make a node's children ignore the parent' transform automatically.

@lawnjelly
Copy link
Member Author

lawnjelly commented Feb 3, 2021

I'll just have a little look to see whether this can be done visual server side, as reduz says he prefers that if possible. 👍 Same kind of idea, just done on the other side of the call.

On the visual server side the function is:

void VisualServerScene::instance_set_transform(RID p_instance, const Transform &p_transform) {

	Instance *instance = instance_owner.get(p_instance);
	ERR_FAIL_COND(!instance);

	if (instance->transform == p_transform)
		return; //must be checked to avoid worst evil

#ifdef DEBUG_ENABLED

	for (int i = 0; i < 4; i++) {
		const Vector3 &v = i < 3 ? p_transform.basis.elements[i] : p_transform.origin;
		ERR_FAIL_COND(Math::is_inf(v.x));
		ERR_FAIL_COND(Math::is_nan(v.x));
		ERR_FAIL_COND(Math::is_inf(v.y));
		ERR_FAIL_COND(Math::is_nan(v.y));
		ERR_FAIL_COND(Math::is_inf(v.z));
		ERR_FAIL_COND(Math::is_nan(v.z));
	}

#endif
	instance->transform = p_transform;
	_instance_queue_update(instance, true);
}

This could easily become something like:

void VisualServerScene::instance_set_transform(RID p_instance, const Transform &p_transform) {

	Instance *instance = instance_owner.get(p_instance);
	ERR_FAIL_COND(!instance);

	if (instance->transform == p_transform)
		return; //must be checked to avoid worst evil

	instance->transform = p_transform;

// if hidden return

Although this is a little neater, the downside is the efficiency. In the visual_instance:

		case NOTIFICATION_TRANSFORM_CHANGED: {
				Transform gt = get_global_transform();
				VisualServer::get_singleton()->instance_set_transform(instance, gt);
		} break;

get_global_transform is problematic. By doing it client side we avoid get_global_transform() which can set up one or more matrix transforms being applied needlessly. In addition we save the function call and saving the transform in the visual server, which aren't required.

So we can easily do in the visual server function, but the caveat is it will be doing needless extra work. Either approach is ok, I would lean on the side of client side still, because I think 3.2 can really benefit from any increases in performance we can give. But admittedly the code is simpler having it in the visual server. I'll try some timings using both approaches to compare.

@lawnjelly
Copy link
Member Author

Node2D and Spatial have a toplevel property (top_level in master). If true, the node won't inherit from its parents' transforms. Can this property be taken into account for this optimization?

I didn't know this property. I'm not absolutely sure if it can help.

However, as you state, there's no "parent" counterpart that would make a node's children ignore the parent' transform automatically.

The wider problem comes down to this:

  • you have a parent, which may have 10,000 children which are all hidden, none of which need their transforms updating.
  • You get a propagate transform notification .. it gets sent to 10,000 children (slow)
  • 10,000 children either decide they are hidden and do no else (this PR)
  • or 10,000 children update their global transform by doing matrix multiplies from the parent, and send this transform to the visual server, which stores it, and then does nothing

Where ideally:

  • you have a parent, which has 10,000 children all of which are hidden
  • you get a propagate transform notification. You don't send to any children. Done.

In the general case you can't take this latter approach, because you don't know in advance which are hidden, and whether some are nodes that require transform even when hidden (e.g. physics). If you query 10,000 nodes to find this info, you've also lost a load of performance, just by hitting all those nodes unnecessarily.

So ideally you'd want some kind of flags to be able to stop propagating certain things:

  • transforms
  • process
  • physics_process
  • etc

I think this is partly what is in this proposal:
godotengine/godot-proposals#1835

Another approach would be to maintain multiple lists on each node, a list of children, and a list of children that need specific updates, and have the children register / unregister. But this becomes quite complex to the point it's not worth it. You could also cache a list of children that need specific updates, and just set these to dirty when things change. That is maybe more feasible.

Having disable flags is probably the most efficient of all but it requires user to take responsibility, whereas some of the others are automatic methods.

There's also the question of where you draw the line with these optimizations, the benefit has to be worth the added code complexity. It is a bit of a trade off. Godot normally errs on the side of simple maintainable code at the cost of some performance.

@lawnjelly
Copy link
Member Author

lawnjelly commented Feb 3, 2021

Ok I've benchmarked both methods now, with 10,000 moving hidden objects as before.

Current Godot 3.2 dev
156 fps

This PR
395 fps

Visual Server approach
207 fps

As I suspected the speed of this PR has increased now that the BVH visibility PR #45576 has been merged.

What is also clear is that the client side approach is not just a bit faster, it is a lot faster than bailing out in the visual server. So on balance I am inclined to think this is worth the slight less elegance over visual server approach. What do you guys think?

@akien-mga akien-mga merged commit f05e068 into godotengine:3.2 Feb 6, 2021
@akien-mga
Copy link
Member

Thanks!

@lawnjelly lawnjelly deleted the optimize_transform_propagate branch February 6, 2021 11:30
@akien-mga akien-mga modified the milestones: 3.2, 3.3 Apr 20, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants