Parallelize Transform Propagation #4697

Closed
james7132 opened this issue May 8, 2022 · 1 comment
Labels
A-Hierarchy (Parent-child entity hierarchies); C-Feature (A new feature, making something new possible); C-Performance (A change motivated by improving speed, memory usage or compile times)

Comments

@james7132
Member

What problem does this solve or what need does it fill?

Transform propagation can get very slow for very large scenes and deep hierarchies. Make it faster.

What solution would you like?

When investigating the performance of transform_propagate_system for #4203, one of the potential options that came up is to chunk up propagation based on the hierarchy roots and run the system in parallel. Using Query::par_for_each_mut as a replacement for single-threaded iteration allows the system to leverage the full ComputeTaskPool for very large and deep hierarchies. However, due to the &mut GlobalTransform, the query for descendant entities cannot be Clone, and thus requires the unsafe Query::get_unchecked to get child entities. This is sound if and only if the hierarchy is strictly a tree, which requires every child in the hierarchy to be globally unique. Unfortunately, there is currently no way to ensure this assumption holds. This can be mitigated with a lock, taken during parallel propagation, that panics on contention.
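To make the "lock that panics on contention" idea concrete, here is a rough std-only sketch. It is not the code from the experiment: `Node`, `propagate`, and `propagate_roots` are hypothetical names, array indices stand in for entities, a single `f32` stands in for a transform, and a `Mutex` with `try_lock` replaces both the unsafe `Query::get_unchecked` access and the lock component. One thread per root stands in for `Query::par_for_each_mut`.

```rust
use std::sync::Mutex;
use std::thread;

// Hypothetical stand-in for an entity: indices replace Entity ids and a
// single f32 translation replaces Transform / GlobalTransform.
struct Node {
    local: f32,
    children: Vec<usize>,
    // Plays the role of the lock that panics on contention.
    global: Mutex<f32>,
}

fn propagate(nodes: &[Node], parent_global: f32, index: usize) {
    let node = &nodes[index];
    let global = parent_global + node.local;
    // try_lock never blocks. If another thread is writing this node right now,
    // the hierarchy is not a tree, so we panic instead of racing.
    // Note this only catches overlapping writes, so detection is best-effort.
    *node
        .global
        .try_lock()
        .expect("entity reached from two subtrees at once") = global;
    for &child in &node.children {
        propagate(nodes, global, child);
    }
}

// One scoped thread per root, a crude substitute for a task pool.
fn propagate_roots(nodes: &[Node], roots: &[usize]) {
    thread::scope(|s| {
        for &root in roots {
            s.spawn(move || propagate(nodes, 0.0, root));
        }
    });
}

fn node(local: f32, children: Vec<usize>) -> Node {
    Node { local, children, global: Mutex::new(0.0) }
}

// Two roots (0 and 1), each with one child (2 and 3).
fn demo() -> Vec<f32> {
    let nodes = vec![
        node(1.0, vec![2]),
        node(10.0, vec![3]),
        node(2.0, vec![]),
        node(5.0, vec![]),
    ];
    propagate_roots(&nodes, &[0, 1]);
    nodes.iter().map(|n| *n.global.lock().unwrap()).collect()
}
```

Calling `demo()` propagates both subtrees in parallel and returns the resulting global values in index order. A child listed under two roots could trip the `expect`, but only if both threads hit it at the same moment, which is part of why a lock alone is a mitigation rather than a guarantee.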

On my local machine, this took the transform_hierarchy -- humanoid_mixed stress test from 8.1 ms per frame to 1.88 ms, a roughly 4.3x speedup, which suggests this use of unsafe code may be worth it, provided the assumptions above hold true.

Here's the resultant code from this experiment:
https://github.com/james7132/bevy/blob/1e7ad38da9d8ea51542b585b3ef1ed76927357f3/crates/bevy_transform/src/systems.rs#L42=

What alternative(s) have you considered?

The proposed solution above has a few drawbacks:

  • unsafe code in userspace code (bevy_transform)
  • A GlobalTransformLock component is visible in userspace ECS. Perhaps a generic Lock<T: Component>?

Adding dynamically lockable components directly into ECS is a potential extension of this idea, and keeps unsafe out of userspace code. There was a brief discussion on Discord about this: https://discord.com/channels/691052431525675048/749335865876021248/972888139783872543

@james7132 added the C-Feature and S-Needs-Triage labels May 8, 2022
@TheRawMeatball added the C-Performance and A-Hierarchy labels and removed the S-Needs-Triage label May 8, 2022
@james7132
Member Author

As noted later in that Discord discussion, locks aren't needed to ensure unique access in the hierarchy case specifically. A child.parent == this check is enough: it guarantees that each child is reached through only one parent, and thus that each child is globally unique within the hierarchy.
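A minimal single-threaded sketch of that check, with hypothetical names (`Node`, `owns`, `propagate`) and array indices in place of entities. Since a child stores exactly one parent, at most one parent can pass the check for it, so at most one traversal ever descends into that child.

```rust
// Hypothetical, simplified hierarchy: array indices stand in for entities.
struct Node {
    parent: Option<usize>,
    children: Vec<usize>,
    local: f32,  // stand-in for Transform
    global: f32, // stand-in for GlobalTransform
}

// True only if `child` names `this` as its parent.
fn owns(nodes: &[Node], this: usize, child: usize) -> bool {
    nodes[child].parent == Some(this)
}

// Writes the global value for `this`, then descends into each child,
// panicking if a listed child does not point back at `this`.
fn propagate(nodes: &mut [Node], this: usize, parent_global: f32) {
    let global = parent_global + nodes[this].local;
    nodes[this].global = global;
    for child in nodes[this].children.clone() {
        assert!(
            owns(nodes, this, child),
            "child {} does not point back to {}",
            child,
            this
        );
        propagate(nodes, child, global);
    }
}

fn node(parent: Option<usize>, children: Vec<usize>, local: f32) -> Node {
    Node { parent, children, local, global: 0.0 }
}

fn demo() -> Vec<f32> {
    // 0 is the root, 1 and 2 are its children, 3 is a child of 1.
    let mut nodes = vec![
        node(None, vec![1, 2], 1.0),
        node(Some(0), vec![3], 2.0),
        node(Some(0), vec![], 3.0),
        node(Some(1), vec![], 4.0),
    ];
    propagate(&mut nodes, 0, 0.0);
    nodes.iter().map(|n| n.global).collect()
}
```

If two parents both list the same child, `owns` is true for at most one of them, and the other traversal hits the assert. That is the "constantly panicking" failure mode whenever the child lists and parent pointers are allowed to drift apart.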

However, for this to work without constantly panicking, the hierarchy must be consistent at all times.

bors bot pushed a commit that referenced this issue Nov 21, 2022
# Objective
Fixes #4697. Hierarchical propagation of properties, currently only Transform -> GlobalTransform, can be a very expensive operation. Transform propagation is a strict dependency for anything positioned in world-space. In large worlds, this can take quite a bit of time, so limiting it to a single thread can result in poor CPU utilization as it bottlenecks the rest of the frame's systems.

## Solution

 - Move entities that have a transform but neither a parent nor a child (free-floating (Global)Transform entities) into a separate parallel system.
 - Chunk the hierarchy based on the root entities and process it in parallel with `Query::par_for_each_mut`. 
 - Utilize the hierarchy's specific properties introduced in #4717 to allow for safe use of `Query::get_unchecked` on multiple threads. Assuming each child is unique in the hierarchy, it is impossible to have an aliased `&mut GlobalTransform` so long as we verify that the parent for a child is the same one propagated from.
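The three steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the merged implementation: all names are hypothetical, indices stand in for entities, one scoped thread per root stands in for `Query::par_for_each_mut`, the free-floating pass is a plain loop rather than a parallel system, and per-thread result buffers that are merged afterwards replace the unsafe `Query::get_unchecked` writes.

```rust
use std::thread;

// Hypothetical stand-in for an entity with Transform, GlobalTransform,
// Parent and Children; array indices replace Entity ids.
struct Node {
    parent: Option<usize>,
    children: Vec<usize>,
    local: f32,
    global: f32,
}

// Collects (entity, global) pairs for the subtree rooted at `this`.
fn collect(nodes: &[Node], this: usize, parent_global: f32, out: &mut Vec<(usize, f32)>) {
    let global = parent_global + nodes[this].local;
    out.push((this, global));
    for &child in &nodes[this].children {
        // Only descend if the child agrees that `this` is its parent.
        assert_eq!(nodes[child].parent, Some(this), "inconsistent hierarchy");
        collect(nodes, child, global, out);
    }
}

fn propagate_all(nodes: &mut [Node]) {
    // Step 1: free-floating entities (no parent, no children) copy local to global.
    for n in nodes
        .iter_mut()
        .filter(|n| n.parent.is_none() && n.children.is_empty())
    {
        n.global = n.local;
    }

    // Step 2: chunk the hierarchy by root and walk each subtree on its own thread.
    let roots: Vec<usize> = (0..nodes.len())
        .filter(|&i| nodes[i].parent.is_none() && !nodes[i].children.is_empty())
        .collect();
    let shared: &[Node] = nodes;
    let results: Vec<Vec<(usize, f32)>> = thread::scope(|s| {
        let handles: Vec<_> = roots
            .iter()
            .map(|&root| {
                s.spawn(move || {
                    let mut out = Vec::new();
                    collect(shared, root, 0.0, &mut out);
                    out
                })
            })
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    });

    // Step 3: write results back. The parent check means no index appears twice.
    for (index, global) in results.into_iter().flatten() {
        nodes[index].global = global;
    }
}

fn node(parent: Option<usize>, children: Vec<usize>, local: f32) -> Node {
    Node { parent, children, local, global: 0.0 }
}

// Root 0 with child 1, free-floating 2, root 3 with child 4.
fn demo() -> Vec<f32> {
    let mut nodes = vec![
        node(None, vec![1], 1.0),
        node(Some(0), vec![], 2.0),
        node(None, vec![], 7.0),
        node(None, vec![4], 10.0),
        node(Some(3), vec![], 5.0),
    ];
    propagate_all(&mut nodes);
    nodes.iter().map(|n| n.global).collect()
}
```

Buffering and merging costs an extra pass compared with writing through `get_unchecked`, but it keeps the sketch in safe Rust while still showing why the parent check rules out two threads producing a value for the same entity.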

---

## Changelog
Removed: `transform_propagate_system` is no longer `pub`.
@bors bors bot closed this as completed in eaeba08 Nov 21, 2022
taiyoungjang pushed a commit to taiyoungjang/bevy that referenced this issue Dec 15, 2022
alradish pushed a commit to alradish/bevy that referenced this issue Jan 22, 2023
ItsDoot pushed a commit to ItsDoot/bevy that referenced this issue Feb 1, 2023