Resource tracking overhead #1413

kvark · 2021-05-28T21:08:51Z

Related to #1411
Firefox profile - https://share.firefox.dev/3fvytEz
If you look at run_render_pass_impl in the flamegraph view, the Vulkan driver is taking less than half of that time.
The rest is spent on barriers and tracking (everything related to TrackerSet), so clearly this is a hot spot.

Ideas:

1. Most buffers and textures are immutable after they get their initial contents transferred into them. It would be wonderful to find a way to avoid tracking them entirely, or at least up to the point where they do get mutated.
2. Today the tracker has (init, current) states per subresource. We could extend it to (init, current, next), so that the sync scope (of, say, a render pass) is accumulated in next. This would mean there is no need to allocate/free tracker sets per pass. Everything would be done right in the command buffer tracker. Related to "Scope-based usage tracking in the render pass" #443
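
For illustration, idea 2 could look roughly like the sketch below. This is not wgpu-core code: `Usage`, `Barrier`, `SubresourceState` and their methods are hypothetical names, and the conflict rule (read-only usages may combine within a scope, anything else must stand alone) is a simplifying assumption.

```rust
/// Hypothetical usage bits; the real tracker has many more.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Usage(pub u32);

impl Usage {
    pub const EMPTY: Usage = Usage(0);
    pub const COPY_DST: Usage = Usage(1 << 0);
    pub const VERTEX: Usage = Usage(1 << 1);
    pub const UNIFORM: Usage = Usage(1 << 2);
    /// Assumption: these are the only usages that may be combined in one scope.
    const READ_ONLY: u32 = Self::VERTEX.0 | Self::UNIFORM.0;

    fn is_read_only(self) -> bool {
        (self.0 & !Self::READ_ONLY) == 0
    }
}

/// A transition that would have to be recorded between two scopes.
#[derive(Debug, PartialEq, Eq)]
pub struct Barrier {
    pub from: Usage,
    pub to: Usage,
}

/// Per-subresource state kept directly in the command buffer tracker.
#[derive(Debug)]
pub struct SubresourceState {
    init: Option<Usage>, // first usage seen in this command buffer
    current: Usage,      // usage after the last closed scope
    next: Usage,         // usage accumulated by the currently open scope
}

impl SubresourceState {
    pub fn new() -> Self {
        Self { init: None, current: Usage::EMPTY, next: Usage::EMPTY }
    }

    pub fn init(&self) -> Option<Usage> {
        self.init
    }

    /// Accumulate a usage into the open scope, without touching `current`.
    /// On conflict, returns the (existing, requested) pair and changes nothing.
    pub fn use_in_scope(&mut self, usage: Usage) -> Result<(), (Usage, Usage)> {
        let merged = Usage(self.next.0 | usage.0);
        if self.next != Usage::EMPTY && self.next != usage && !merged.is_read_only() {
            return Err((self.next, usage));
        }
        self.next = merged;
        Ok(())
    }

    /// Close the scope: fold `next` into `current` and report a barrier if needed.
    pub fn end_scope(&mut self) -> Option<Barrier> {
        if self.next == Usage::EMPTY {
            return None; // the subresource was not used in this scope
        }
        let next = std::mem::replace(&mut self.next, Usage::EMPTY);
        if self.init.is_none() {
            // First use in the command buffer: remember it, no barrier here.
            self.init = Some(next);
            self.current = next;
            return None;
        }
        if self.current == next {
            return None;
        }
        let from = std::mem::replace(&mut self.current, next);
        Some(Barrier { from, to: next })
    }
}
```

With this layout a pass never owns a tracker set of its own. It only writes into `next`, and closing the pass walks the touched subresources once.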


kvark · 2021-05-29T01:12:29Z

Assembly for the following functions needs to be looked at under a microscope:

gfx_backend_vulkan::command::CommandBuffer::bind_descriptor_sets: roughly 1/9 of it is actually spent in ash::vk::features::DeviceFnV1_0::cmd_bind_descriptor_sets. The rest includes boilerplate and inlined pipeline compatibility logic.
wgpu_core::command::CommandBuffer::insert_barriers seems quite heavy, also does some refcount stuff (unexpectedly?)
wgpu_core::command::bind::Binder::assign_group does refcounting. I don't think it should?
wgpu_core::track::TrackerSet::merge_extend and wgpu_core::track::ResourceTracker::use_extend: maybe we can make them cache friendlier?
wgpu_core::hub::Storage::get shows up a bit everywhere, but a lot in submit()

kvark · 2021-05-31T17:15:25Z

Another idea (3) - a merge of 1) and 2) but without any heavy changes. If we know something doesn't have a state (but just needs to be added to the lifetime tracker), then adding it to the render pass tracker is a waste. We could make it so only buffers and textures are tracked per pass (or per usage scope), and everything else goes straight to the command buffer tracker. For the animometer benchmark, it would cut the costs by almost a factor of 2.

1417: Split the tracker into stateful/stateless to reduce the overhead r=cwfitzgerald a=kvark

**Connections**
Implements #1413 (comment)
Reduces the overhead for resource tracking in the Animometer benchmark by up to 50%.

**Description**
We used to use the full tracker set on the usage scopes associated with compute/render passes. A resource tracker has 2 responsibilities: ensuring the resource is held alive, and validating and recording the state transitions. This PR exploits the fact that the latter responsibility is only applicable for buffers and textures. So doing all the lifetime tracking for a pass is a waste: we can instead just attach the lifetimes to the parent command buffer, straight.

In the Animometer benchmark, there is one large buffer, and thousands of bind groups pointing to different offsets into it. The old code would fill up the pass tracker with those bind groups, and then merge it into the command buffer tracker. The new code would just fill up the command buffer tracker instead. Since there is only one buffer, the pass tracking becomes much lighter.

**Testing**
Untested. It would be nice to have some benchmarks here, possibly after #1397 ?

Co-authored-by: Dzmitry Malyshau <kvarkus@gmail.com>
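
A rough sketch of the stateful/stateless split described above, using the Animometer shape (many bind groups, one buffer). None of these types are the actual wgpu-core ones: `StatelessTracker`, `StatefulTracker`, `PassScope`, `CommandBufferTracker` and `record_pass` are hypothetical, and merging usages with a bitwise OR stands in for the real transition validation.

```rust
use std::collections::{HashMap, HashSet};

pub type Id = u32;

/// Lifetime-only tracking: just remembers which resources must stay alive.
#[derive(Default)]
pub struct StatelessTracker {
    pub alive: HashSet<Id>,
}

/// State tracking: one usage bitmask per resource (simplified, no subresources).
#[derive(Default)]
pub struct StatefulTracker {
    pub usage: HashMap<Id, u32>,
}

/// Hypothetical bind group that references a single buffer.
pub struct BindGroup {
    pub id: Id,
    pub buffer: Id,
    pub usage: u32,
}

/// The command buffer keeps both kinds of trackers.
#[derive(Default)]
pub struct CommandBufferTracker {
    pub buffers: StatefulTracker,
    pub bind_groups: StatelessTracker,
}

/// A pass (usage scope) only carries the stateful part.
#[derive(Default)]
pub struct PassScope {
    pub buffers: StatefulTracker,
}

/// Records one pass and returns how many entries had to be merged
/// from the pass scope into the command buffer tracker.
pub fn record_pass(cmd: &mut CommandBufferTracker, groups: &[BindGroup]) -> usize {
    let mut scope = PassScope::default();
    for group in groups {
        // Stateless: the lifetime goes straight to the command buffer.
        cmd.bind_groups.alive.insert(group.id);
        // Stateful: the buffer usage is accumulated in the pass scope.
        *scope.buffers.usage.entry(group.buffer).or_insert(0) |= group.usage;
    }
    let merged = scope.buffers.usage.len();
    for (id, usage) in scope.buffers.usage {
        *cmd.buffers.usage.entry(id).or_insert(0) |= usage;
    }
    merged
}
```

In this toy model, a pass that binds a thousand groups over one buffer merges a single stateful entry at the end of the pass, while the bind groups are inserted into the command buffer tracker once and never copied.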

kvark · 2021-06-04T16:08:59Z

New profile after (3) is merged - https://share.firefox.dev/3pjq2Qa
There are still things to address here. There is also a gap not annotated by anything. Is running the render pass doing something heavy there?

cwfitzgerald · 2022-06-06T04:12:35Z

Closing after #2662

kvark added the labels type: enhancement, help required, area: performance on May 28, 2021

kvark mentioned this issue May 29, 2021

Default resource usage semantics #1414

Closed

kvark mentioned this issue May 31, 2021

Split the tracker into stateful/stateless to reduce the overhead #1417

Merged

cwfitzgerald closed this as completed Jun 6, 2022
