Requiring mutable access to several common shared types prevents queue phase parallelization. #3548

james7132 · 2022-01-04T20:32:47Z

What problem does this solve or what need does it fill?

Query<&mut RenderPhase<T>>, RenderPipelineCache, SpecializedPipelines (to a lesser degree) requires mutable access to use. Each of these can limit the parallelism of the queue phase when rendering.

Query<&mut RenderPhase<T>> limits it to one system per phase at a time. Generally allows parallelism across mutliple phases at the same time, unless there is one system that requires access to multiple render phases.
RenderPipelineCache limits any dependent pipeline from parallelizing.
SpecializedPipelines limits parallelization across multiple phases within a single extract-prepare-queue pipeline. This is less of an issue than the other two.

What solution would you like?

Merge SpecializedPipelines functionality into RenderPipelines and make it generic over a SpecializedPipeline. This removes the global pipeline cache as a blocking requirement. Might also improve the ergonomics of writing those kinds of systems.
Switch any/all of the associated types to use internal mutability. Most notably RenderPhase<T> might benefit the most from this. Adding mutexes may be too much of a cost here. Perhaps a good place to try out lock-free data structures? Worth benchmarking.

What alternative(s) have you considered?

Recreate render pipelines on every executions.
Limit time in each queue job by aggressively batching it's inputs in the prepare phase.

The text was updated successfully, but these errors were encountered:

# Objective Partially addresses #3548. `queue_shadows` and `queue_material_meshes` cannot parallelize because of the `ResMut<RenderMeshInstances>` parameter for `queue_material_meshes`. ## Solution Change the `material_bind_group` field to use atomics instead of needing full mutable access. Change the `ResMut` to a `Res`, which should allow both sets of systems to parallelize without issue. ## Performance Tested against `many_foxes`, this has a significant improvement over the entire render schedule. (Yellow is this PR, red is main) ![image](https://github.com/bevyengine/bevy/assets/3137680/6cc7f346-4f50-4f12-a383-682a9ce1daf6) The use of atomics does seem to have a negative effect on `queue_material_meshes` (roughly a 8.29% increase in time spent in the system). ![image](https://github.com/bevyengine/bevy/assets/3137680/7907079a-863d-4760-aa5b-df68c006ea36) `queue_shadows` seems to be ever so slightly slower (1.6% more time spent) in the system. ![image](https://github.com/bevyengine/bevy/assets/3137680/6d90af73-b922-45e4-bae5-df200e8b9784) `batch_and_prepare_render_phase` seems to be a mix, but overall seems to be slightly *faster* by about 5%. ![image](https://github.com/bevyengine/bevy/assets/3137680/fac638ff-8c90-436b-9362-c6209b18957c)

# Objective Partially addresses bevyengine#3548. `queue_shadows` and `queue_material_meshes` cannot parallelize because of the `ResMut<RenderMeshInstances>` parameter for `queue_material_meshes`. ## Solution Change the `material_bind_group` field to use atomics instead of needing full mutable access. Change the `ResMut` to a `Res`, which should allow both sets of systems to parallelize without issue. ## Performance Tested against `many_foxes`, this has a significant improvement over the entire render schedule. (Yellow is this PR, red is main) ![image](https://github.com/bevyengine/bevy/assets/3137680/6cc7f346-4f50-4f12-a383-682a9ce1daf6) The use of atomics does seem to have a negative effect on `queue_material_meshes` (roughly a 8.29% increase in time spent in the system). ![image](https://github.com/bevyengine/bevy/assets/3137680/7907079a-863d-4760-aa5b-df68c006ea36) `queue_shadows` seems to be ever so slightly slower (1.6% more time spent) in the system. ![image](https://github.com/bevyengine/bevy/assets/3137680/6d90af73-b922-45e4-bae5-df200e8b9784) `batch_and_prepare_render_phase` seems to be a mix, but overall seems to be slightly *faster* by about 5%. ![image](https://github.com/bevyengine/bevy/assets/3137680/fac638ff-8c90-436b-9362-c6209b18957c)

james7132 added C-Feature A new feature, making something new possible S-Needs-Triage This issue needs to be labelled labels Jan 4, 2022

james7132 changed the title ~~Requiring mutable access to several types prevents queue phase parallelization.~~ Requiring mutable access to several common shared types prevents queue phase parallelization. Jan 4, 2022

alice-i-cecile added A-Rendering Drawing game state to the screen C-Performance A change motivated by improving speed, memory usage or compile times and removed S-Needs-Triage This issue needs to be labelled C-Feature A new feature, making something new possible labels Jan 4, 2022

james7132 mentioned this issue Dec 10, 2022

Improve Queue Phase parallelization and other small optimizations #4899

Closed

james7132 mentioned this issue Feb 10, 2024

Allow Mesh-related queue phase systems to parallelize #11804

Merged

james7132 mentioned this issue Feb 19, 2024

Enable queue phase parallelization by using Parallel to back RenderPhase #11984

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Requiring mutable access to several common shared types prevents queue phase parallelization. #3548

Requiring mutable access to several common shared types prevents queue phase parallelization. #3548

james7132 commented Jan 4, 2022 •

edited

Loading

Requiring mutable access to several common shared types prevents queue phase parallelization. #3548

Requiring mutable access to several common shared types prevents queue phase parallelization. #3548

Comments

james7132 commented Jan 4, 2022 • edited Loading

What problem does this solve or what need does it fill?

What solution would you like?

What alternative(s) have you considered?

james7132 commented Jan 4, 2022 •

edited

Loading