Conversation

chescock
Contributor

Objective

Stop using `ArchetypeComponentId` in the executor. These IDs will grow even more quickly with relations, and the size may start to degrade performance.

Solution

Have systems expose their `FilteredAccessSet<ComponentId>`, and have the executor use that to determine which systems conflict. This can be determined statically, so determine all conflicts during initialization and only perform bit tests when running.

Testing

I ran many_foxes and didn't see any performance changes. It's probably worth testing this with a wider range of realistic schedules to see whether the reduced concurrency has a cost in practice, but I don't know what sort of test cases to use.

Migration Guide

The schedule will now prevent systems from running in parallel if there *could* be an archetype that they conflict on, even if there aren't actually any. For example, these systems will now conflict even if no entity has both `Player` and `Enemy` components:

```rust
fn player_system(query: Query<(&mut Transform, &Player)>) {}
fn enemy_system(query: Query<(&mut Transform, &Enemy)>) {}
```

To allow them to run in parallel, use `Without` filters, just as you would to allow both queries in a single system:

```rust
// Either one of these changes alone would be enough
fn player_system(query: Query<(&mut Transform, &Player), Without<Enemy>>) {}
fn enemy_system(query: Query<(&mut Transform, &Enemy), Without<Player>>) {}
```

@alice-i-cecile alice-i-cecile added A-ECS Entities, components, systems, and events C-Performance A change motivated by improving speed, memory usage or compile times labels Dec 18, 2024
@alice-i-cecile
Member

Ah, so we're moving from factual to hypothetical conflicts here. Interesting. I think that this is the right direction, to reduce pain with relations, and also to allow us to precompute more about the schedule ordering ahead of time in the future.

@alice-i-cecile alice-i-cecile added X-Contentious There are nontrivial implications that should be thought through S-Needs-Review Needs reviewer attention (from anyone!) to move forward labels Dec 18, 2024
@hymm hymm self-requested a review December 18, 2024 19:15
@hymm
Contributor

hymm commented Dec 24, 2024

Many components stress test

Times below are the update schedule times in ms. Negative changes mean this PR is slower.

| entities | components | systems | a-c ids | PR | main | change |
| --- | --- | --- | --- | --- | --- | --- |
| 50000 | 100 | 100 | 240028 | 1.93 | 1.9 | -1.55% |
| 50000 | 100 | 1600 | 240016 | 39.32 | 24.32 | -38.15% |
| 50000 | 2000 | 100 | 246549 | 0.138 | 0.169 | 22.46% |
| 50000 | 2000 | 1600 | 246732 | 2.92 | 3.17 | 8.56% |
| 1000000 | 100 | 100 | 4576062 | 62.63 | 62.93 | 0.48% |
| 1000000 | 100 | 1600 | 4576124 | 1750 | 1320 | -24.57% |
| 1000000 | 2000 | 100 | 4881460 | 4.03 | 6.28 | 55.83% |
| 1000000 | 2000 | 1600 | 4881633 | 112.91 | 131.56 | 16.52% |

Note: We see that the frame time decreases as the number of components grows. This happens because each system ends up matching fewer entities and so has less work to do.

Trying to interpret these results

We see some regressions in the 100 components / 1600 systems results. This is somewhat expected, as these are a worst-case scenario for this PR: we end up with fewer systems that can run in parallel because of the more pessimistic access check. We can see this in the tracing spans below; this PR's timeline is much more sparse than main's.

main:
[tracing spans screenshot]

this PR:
[tracing spans screenshot]

We also see some improvements. These seem to come from cases where there aren't many new conflicts between systems, and the cheaper checks let system tasks spawn faster and bring an extra thread or two online. That helps when there is enough work for all the threads, but when there isn't, starting extra threads hurts, as in the 50000-100-100 results.

Overall I think these results are as expected. We lose some parallelism, but gain some speed spawning system tasks due to cheaper checks when there's a large number of archetype components. I'll probably investigate a little more by varying the number of systems to get a better idea of how things scale as the system count increases. But this doesn't answer the question of "What does a more realistic schedule look like?" I would expect conflicts to be low or nonexistent, since those should get detected as ambiguities between systems and then get ordered.

I also ran many_foxes, and the changes seemed to be within noise: some runs ended up slightly slower and some slightly faster.

Some future work could be to remove conflicts between systems that are already ordered.

@alice-i-cecile
Member

I would expect conflicts to be low or nonexistent, since those should get detected as ambiguities between systems and then get ordered.

Fully agree with this analysis, and great benchmarking overall. If we had as-needed ordering (which would check archetype-component access) I'd quibble with you, but as it stands, production projects are unlikely to rely on the kinds of parallelism that the old checks enable and these checks disallow.

Based on my conversations with users, I think that in practice our current parallelism is too fine-grained for realistic projects. Swapping to single-threaded schedules for your main update loop shouldn't be speeding things up! Both the checks and the system task dispatch are too expensive for all but the heaviest systems. We can help improve the former here at least.

@alice-i-cecile alice-i-cecile added this to the 0.16 milestone Dec 24, 2024
@alice-i-cecile
Member

@inodentry, if you could test this PR on one or two of your projects I'd be very curious about the relative change in performance. I know scheduling overhead has been a bugbear of yours, and this should help improve it a bit (in addition to clearing a blocker for relations).

@hymm
Contributor

hymm commented Dec 24, 2024

Note that this PR is going to conflict with #16784. My preference is to merge this PR first, since it avoids the important perf regressions in that one. Any other perf regressions should just be at schedule build time.

@chescock
Contributor Author

Note that this PR is going to conflict with #16784.

Just to clarify: They won't cause a merge conflict in git, right? I think the only interaction is that this PR mitigates some of the cost of that one.

We can do even better by completely removing ArchetypeComponentId! I didn't want to do that in this PR because it would make it larger, so systems are still spending time populating their Access<ArchetypeComponentId>. But those sets are no longer used, and we can remove them in a follow-up.

Some future work could be to remove conflicts between systems that are already ordered

Yup! I think we'd want to move the calculation out of the executor and into the schedule graph to do that, since that's where we have the transitive ordering available. That would also let the ambiguity checker share these more precise checks! It currently ignores filters, so it reports things that I consider false positives.

@alice-i-cecile
Member

That would be very nice. It would be great if we could standardize on one "check if things conflict" mechanism, both for simplicity and for teaching.

@hymm
Contributor

hymm commented Dec 27, 2024

Note that this PR is going to conflict with #16784.

Just to clarify: They won't cause a merge conflict in git, right? I think the only interaction is that this PR mitigates some of the cost of that one.

I was confused. I thought that PR modified the access checks in the multithreaded executor too, but looking at it again, it doesn't seem to conflict.

@hymm
Contributor

hymm commented Dec 27, 2024

I'm not in a hurry to merge this PR, since it seems like we'll be getting non-fragmenting relations in the short term instead of the fragmenting version. Ideally, we'd get some testing from users before merging to make sure this doesn't regress things.

@BenjaminBrienen BenjaminBrienen added the D-Modest A "normal" level of difficulty; suitable for simple features or challenging fixes label Jan 22, 2025
…component-access

# Conflicts:
#	crates/bevy_ecs/src/schedule/executor/mod.rs
#	crates/bevy_ecs/src/schedule/executor/multi_threaded.rs
@alice-i-cecile alice-i-cecile removed this from the 0.16 milestone Feb 25, 2025
chescock added 2 commits March 7, 2025 13:22
…component-access

# Conflicts:
#	crates/bevy_ecs/src/schedule/executor/mod.rs
#	crates/bevy_ecs/src/schedule/executor/multi_threaded.rs
#	crates/bevy_ecs/src/system/observer_system.rs
#	crates/bevy_ecs/src/system/schedule_system.rs
…component-access

# Conflicts:
#	crates/bevy_ecs/src/schedule/executor/multi_threaded.rs
@notmd
Contributor

notmd commented Apr 22, 2025

Couldn't you use `ComponentId` to build the topology graph ahead of time and remove the runtime check entirely?

@chescock
Contributor Author

Couldn't you use `ComponentId` to build the topology graph ahead of time and remove the runtime check entirely?

There are definitely interesting things worth testing in this space! I made some attempts at building a static graph ahead of time, but I wasn't able to find anything that improved performance on the schedules I was testing.

I'd like to keep this PR focused on doing a minimal change to remove the dependency on ArchetypeComponentId, though.

@alice-i-cecile alice-i-cecile added S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it and removed S-Needs-Review Needs reviewer attention (from anyone!) to move forward labels Apr 29, 2025
@alice-i-cecile alice-i-cecile added this pull request to the merge queue May 5, 2025
@alice-i-cecile
Member

I think this is a better default, and it unlocks a lot of functionality. As we clean up the executor code, we should re-add an archetype-parallel executor for people to compare against.

Merged via the queue into bevyengine:main with commit 55bb59b May 5, 2025
34 checks passed
@chescock
Contributor Author

chescock commented May 7, 2025

As we clean up the executor code, we should re-add an archetype-parallel executor for people to compare against.

Oh! I had been planning a follow-up PR to completely remove ArchetypeComponentId, since it's unused after this PR. That would cut out a bunch of time and space calculating and storing the access. I didn't want to do it as part of this PR because it touches a lot of code and would collect merge conflicts, so I thought we should merge the controversial part first. But if we're planning to re-add an executor that uses ArchetypeComponentId, then maybe we don't want to remove it!

Should I create a PR to remove ArchetypeComponentId, or should I wait?

@alice-i-cecile
Member

Remove it! That's a very strong argument.

andrewzhurov pushed a commit to andrewzhurov/bevy that referenced this pull request May 17, 2025
# Objective

Stop using `ArchetypeComponentId` in the executor. These IDs will grow
even more quickly with relations, and the size may start to degrade
performance.

## Solution

Have systems expose their `FilteredAccessSet<ComponentId>`, and have the
executor use that to determine which systems conflict. This can be
determined statically, so determine all conflicts during initialization
and only perform bit tests when running.

## Testing

I ran many_foxes and didn't see any performance changes. It's probably
worth testing this with a wider range of realistic schedules to see
whether the reduced concurrency has a cost in practice, but I don't know
what sort of test cases to use.

## Migration Guide

The schedule will now prevent systems from running in parallel if there
*could* be an archetype that they conflict on, even if there aren't
actually any. For example, these systems will now conflict even if no
entity has both `Player` and `Enemy` components:
```rust
fn player_system(query: Query<(&mut Transform, &Player)>) {}
fn enemy_system(query: Query<(&mut Transform, &Enemy)>) {}
```

To allow them to run in parallel, use `Without` filters, just as you
would to allow both queries in a single system:
```rust
// Either one of these changes alone would be enough
fn player_system(query: Query<(&mut Transform, &Player), Without<Enemy>>) {}
fn enemy_system(query: Query<(&mut Transform, &Enemy), Without<Player>>) {}
```
github-merge-queue bot pushed a commit that referenced this pull request May 27, 2025
# Objective

Remove `ArchetypeComponentId` and `archetype_component_access`.
Following #16885, they are no longer used by the engine, so we can stop
spending time calculating them or space storing them.

## Solution

Remove `ArchetypeComponentId` and everything that touches it.  

The `System::update_archetype_component_access` method no longer needs
to update `archetype_component_access`. We do still need to update query
caches, but we no longer need to do so *before* running the system. We'd
have to touch every caller anyway if we gave the method a better name,
so just remove `System::update_archetype_component_access` and
`SystemParam::new_archetype` entirely, and update the query cache in
`Query::get_param`.

The `Single` and `Populated` params also need their query caches updated
in `SystemParam::validate_param`, so change `validate_param` to take
`&mut Self::State` instead of `&Self::State`.
Shatur added a commit to simgine/bevy_replicon that referenced this pull request Sep 28, 2025
- State now initialized separately.
- Updating archetype access no longer needed.
- `new_archetype` no longer exists and archetype cache needs to be
  updated in `get_param`.
- `ReplicationRules` resource no longer need to be cloned.

For details see bevyengine/bevy#16885
bevyengine/bevy#19143
Shatur added a commit to simgine/bevy_replicon that referenced this pull request Oct 3, 2025
* Bump Bevy version
* Migrate to the new Entity layout
For details see bevyengine/bevy#19121
bevyengine/bevy#18704
The niche is now in the index, which makes the compression logic even
simpler.
* Migrate to the new `SystemParam` changes
For details see bevyengine/bevy#16885
bevyengine/bevy#19143
* Remove `*_trigger_targets`
* Simplify fns logic
We no longer need custom ser/de to additionally serialize targets. This
allows us to express things a bit nicer using conversion traits.
* Rename all "event" into "message".
* Rename all "trigger" into "event".
* Rename "resend locally" into just "send locally"
Fits better.
* Split channel methods

---------

Co-authored-by: UkoeHB <37489173+UkoeHB@users.noreply.github.com>