Add GpuArrayBuffer and BatchedUniformBuffer #8204
Conversation
Code and comment quality is good. The code broadly looks right, but I really need an example to kick it around properly. I don't think the feature needs an example in the repo, but maybe you have something you've been using while building it that I could look at?
I have some messy code here that uses GpuList for MeshUniforms. If it's not helpful, I can write up a separate example. The next step after this PR will be to use GpuList for mesh MeshUniforms, and then materials.

Reminder to myself that I need to add robswain to the commit authors.

I kind of fixed the commit history... good enough, I guess?
Just a few small changes. Otherwise LGTM. I look forward to using this. :)
One nice-to-have, though I'm not sure how to do it: it would be useful to have a nice shader abstraction for it. I suppose the shader side will be either:

```wgsl
var<uniform> my_list: array<T, #{T_BATCH_SIZE}>;
```

or

```wgsl
var<storage> my_list: array<T>;
```

but either way, the array will be indexed into, so the interface should be the same. So we need a way of setting the binding type to either `uniform` or `storage`, and if uniform, then setting the array size. Currently I think one could do that like this:

```wgsl
#ifdef T_BATCH_SIZE
var<uniform> my_list: array<T, #{T_BATCH_SIZE}u>;
#else
var<storage> my_list: array<T>;
#endif
```
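On the Rust side, the matching bind group layout entry would flip between the same two binding types; a minimal sketch using plain wgpu types (the function and its `batch_size` parameter are illustrative, not part of this PR):

```rust
// Some(batch_size) mirrors the #ifdef T_BATCH_SIZE branch above;
// None corresponds to the storage-buffer branch.
fn my_list_layout_entry(batch_size: Option<u64>) -> wgpu::BindGroupLayoutEntry {
    wgpu::BindGroupLayoutEntry {
        binding: 0,
        visibility: wgpu::ShaderStages::VERTEX_FRAGMENT,
        ty: wgpu::BindingType::Buffer {
            ty: match batch_size {
                // Uniform fallback: fixed-size array, rebound via dynamic offsets.
                Some(_) => wgpu::BufferBindingType::Uniform,
                // Storage buffers supported: runtime-sized, read-only array.
                None => wgpu::BufferBindingType::Storage { read_only: true },
            },
            has_dynamic_offset: batch_size.is_some(),
            min_binding_size: None,
        },
        count: None,
    }
}
```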
```rust
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq, PartialOrd, Ord)]
struct MaxCapacityArray<T>(T, usize);

impl<T> ShaderType for MaxCapacityArray<T>
where
    T: ShaderType<ExtraMetadata = ArrayMetadata>,
{
    type ExtraMetadata = ArrayMetadata;

    const METADATA: Metadata<Self::ExtraMetadata> = T::METADATA;

    fn size(&self) -> ::core::num::NonZeroU64 {
        // Report the size of the full capacity (self.1), not the current
        // length, so every batch binds with the same fixed array size.
        Self::METADATA.stride().mul(self.1.max(1) as u64).0
    }
}

impl<T> WriteInto for MaxCapacityArray<T>
where
    T: WriteInto + RuntimeSizedArray,
{
    fn write_into<B: BufferMut>(&self, writer: &mut Writer<B>) {
        // Only the actual elements are written; the rest of the
        // capacity is reserved but untouched.
        debug_assert!(self.0.len() <= self.1);
        self.0.write_into(writer);
    }
}
```
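For intuition, the sizing rule in `size()` can be restated as a tiny standalone function (illustrative only; `batch_size_bytes` is not part of encase or this PR):

```rust
/// Bytes one batch reserves on the GPU: always the full capacity, so a single
/// bind group layout with a fixed `array<T, N>` size fits every batch, even
/// a partially filled final one.
fn batch_size_bytes(stride: u64, capacity: usize) -> u64 {
    // Mirrors MaxCapacityArray::size(): stride * max(capacity, 1).
    stride * capacity.max(1) as u64
}
```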
This code was written by @teoxoy, so we need to add credit for them to the commit that introduces it.
Thanks! If this is ready for production I can merge the branch in encase and do a release.
Let me know!
It works fine for us. :) There is that other aspect: being able to start the next dynamic-offset binding of a uniform buffer at the next dynamic offset alignment if not all space is used, while ensuring the final binding is full-size. I don't know if that would clash with this and basically immediately deprecate this approach. If so, maybe you'd prefer that we use a solution in bevy for what we need, and add the long-term, more flexible solution to encase when someone gets to it. What do you think?
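For reference, "starting at the next dynamic offset alignment" is just rounding the running offset up to a power-of-two alignment such as `wgpu::Limits::min_uniform_buffer_offset_alignment`; a minimal sketch (the helper name is ours, not encase's):

```rust
/// Round `offset` up to the next multiple of `alignment`.
/// Dynamic-offset alignments in wgpu are powers of two (e.g. 256).
fn align_offset(offset: u64, alignment: u64) -> u64 {
    debug_assert!(alignment.is_power_of_two());
    (offset + alignment - 1) & !(alignment - 1)
}

// e.g. align_offset(300, 256) == 512, align_offset(256, 256) == 256
```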
I won't block the PR on this. We can figure it out over time. :)
I tried to rebase to give credit on the original commit but due to merges it was a pain. I instead added a comment and a co-authored-by so that when the squash merge is done, the credit will follow along with it.
Ok, we can further iterate and see what we come up with. Thanks for the credit!
Sadly we don't have a macro system (nor am I sure we'd want one; those are easy to abuse...), so I'm not sure how we'd do that. Maybe some kind of custom thing in our shader parser, like `var<gpu_list> my_list: GpuList<T>;`, that the parser replaces with the appropriate declaration before passing it to naga.
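A hedged sketch of what such a parser pass might look like, purely to illustrate the idea (the function, the hard-coded declaration, and the `batch_size` flag are all hypothetical):

```rust
/// Illustrative only: a naive preprocessor pass in the spirit of the comment
/// above. Rewrites a hard-coded `var<gpu_list>` declaration into either a
/// storage-buffer or batched-uniform declaration before the source reaches
/// naga. A real pass would parse rather than string-replace.
fn expand_gpu_list(source: &str, batch_size: Option<u32>) -> String {
    let decl = "var<gpu_list> my_list: GpuList<T>;";
    let replacement = match batch_size {
        // WebGL2 fallback: fixed-size uniform array, rebound via dynamic offsets.
        Some(n) => format!("var<uniform> my_list: array<T, {n}u>;"),
        // Storage buffers available: runtime-sized array.
        None => "var<storage> my_list: array<T>;".to_owned(),
    };
    source.replace(decl, &replacement)
}
```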
Co-authored-by: Robert Swain <robert.swain@gmail.com>
JMS55 confirmed on Discord that I can commit the two outstanding proposals and rebase this PR to get it merged, as they are busy with other things at the moment.
Co-authored-by: François <mockersf@gmail.com>
Haven't reviewed it yet, so I might be missing context, but I do think I'd prefer GpuArrayBuffer. GpuList also confused me for the longest time when seeing the PR title; I didn't really look into it because I thought it was something about listing GPUs.
Due to popular demand, GpuList -> GpuArrayBuffer. I think this PR is ready to go now :)
Couple of tiny nits, but otherwise LGTM.
Co-authored-by: IceSentry <IceSentry@users.noreply.github.com>
Let's get this merged after this docs change, pending another approval.
Co-authored-by: robtfm <50659922+robtfm@users.noreply.github.com>
# Objective

This is a minimally disruptive version of #8340. I attempted to update it, but failed due to the scope of the changes added in #8204.

Fixes #8307. Partially addresses #4642.

As seen in #8284, we're actually copying data twice in Prepare stage systems: once into a CPU-side intermediate scratch buffer, and once again into a mapped buffer. This is inefficient and effectively doubles the time spent and memory allocated to run these systems.

## Solution

Skip the scratch buffer entirely and use `wgpu::Queue::write_buffer_with` to directly write data into mapped buffers.

Separately, this also directly uses `wgpu::Limits::min_uniform_buffer_offset_alignment` to set up the alignment when writing to the buffers, partially addressing the issue raised in #4642. Storage buffers and the abstractions built on top of `DynamicUniformBuffer` will need to come in followup PRs.

This may not have a noticeable performance difference in this PR, as the only first-party systems affected by this are view related, and likely are not going to be particularly heavy.

---

## Changelog

- Added: `DynamicUniformBuffer::get_writer`.
- Added: `DynamicUniformBufferWriter`.
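As context for the quoted solution, here is a minimal sketch of writing through `wgpu::Queue::write_buffer_with` (the `upload` helper is hypothetical; recent wgpu versions return an `Option`al mapped view):

```rust
// Copy `data` straight into `buffer` via wgpu's staging belt, with no
// CPU-side scratch allocation in between.
fn upload(queue: &wgpu::Queue, buffer: &wgpu::Buffer, data: &[u8]) {
    let Some(size) = wgpu::BufferSize::new(data.len() as u64) else {
        return; // nothing to write
    };
    // write_buffer_with hands back a mapped view into staging memory.
    if let Some(mut view) = queue.write_buffer_with(buffer, 0, size) {
        view.copy_from_slice(data);
    }
}
```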
# Objective

- Add a type for uploading a Rust `Vec<T>` to a GPU `array<T>`.

## Solution

- Port @superdump's `BatchedUniformBuffer` to bevy main, as a fallback for WebGL2, which doesn't support storage buffers.
  - Uniform buffers have a limited size, so rather than an `array<T>` in a shader, you get an `array<T, N>`, and have to rebind every N elements via dynamic offsets.
- Add `GpuArrayBuffer` to abstract over `StorageBuffer<Vec<T>>`/`BatchedUniformBuffer`.

## Future Work

Add a shader macro kinda thing to abstract over the following automatically: #8204 (review)

## Changelog

- Added `GpuArrayBuffer`, `GpuComponentArrayBufferPlugin`, `GpuArrayBufferable`, and `GpuArrayBufferIndex` types.
- Added `DynamicUniformBuffer::new_with_alignment()`.
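To make the changelog concrete, a hedged sketch of how `GpuArrayBuffer` might be driven during render-world preparation (based on the Bevy API of this era; exact signatures may differ, and the function shape is illustrative):

```rust
use bevy::render::render_resource::{GpuArrayBuffer, GpuArrayBufferable};
use bevy::render::renderer::{RenderDevice, RenderQueue};

// Generic over any T that can live in a GpuArrayBuffer.
fn prepare_array<T: GpuArrayBufferable>(
    device: &RenderDevice,
    queue: &RenderQueue,
    items: Vec<T>,
) -> GpuArrayBuffer<T> {
    let mut buffer = GpuArrayBuffer::<T>::new(device);
    for item in items {
        // push() returns a GpuArrayBufferIndex<T> carrying the element index
        // and, on the uniform fallback, the dynamic offset of its batch.
        let _index = buffer.push(item);
    }
    // Uploads as one storage buffer where supported, or as batched uniform
    // bindings (rebound via dynamic offsets) on WebGL2.
    buffer.write_buffer(device, queue);
    buffer
}
```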