Compute example #22
Thanks for putting in the effort on this - I've not had a chance to test it yet, but if it runs and produces the expected output then I'd say it's a strong start! |
Not yet. I need to read more about how bevy handles shaders and didn't have the energy to do so yet. Planning on taking another stab this or the next weekend |
I hope that I am missing something obvious, but the ways of achieving the result that I see feel a bit backwards. The current problem is that the bevy compute pipeline requires the bevy shader handle. I initially thought it would be easiest to just do this processing in the render graph node "update" method. At this point I think the most straightforward way may be to provide a custom AssetLoader that builds a bevy |
I ended up in a situation where I think it should work, but it doesn't and it doesn't even complain that something is wrong. |
I'll take another stab in two weeks unless someone figures it out sooner |
Sounds like great progress, I'm excited to try it out! |
Wasn't able to debug why nothing is showing, lost the remainder of my motivation reading through the SPIR-V specification. Might take another stab in a few months |
Ha yeah SPIR-V is pretty esoteric. Awesome work @samoylovfp 🙇 |
That's great! Do you think the notably slower frame rate is because of Bevy's |
That old noise crate was the only no-std crate that worked - check rust-gpu/shader/lib/noise. To see if it's really slow or not, we'd have to translate it and compare with WGSL on the same GPU - otherwise the comparison with CPU is meaningless (who knows what intrinsics magically show up on the CPU side?) Anyway, it's a starting point for doing your own computation and benchmarks. |
If the maintainer is still interested I can make separate PRs for each of the vendored codebases - otherwise, it's really a bother to work with many little crates spread around, when I need to upgrade dependencies in each and every one |
The timings of GPU vs. CPU for the noise crate:
I think the noise crate is a little bit too much compute for each thread. I'm sure other tasks are better suited for this - though I'd first translate some WGSL/GLSL compute benchmarks into Rust and see if there are major losses in SPIR-V/SPIR-T/whatever. EDIT: Actually only 4x slower than CPU - I think it's working correctly |
Here is Game of Life in 80 lines, running at 4k at 60 fps. Ported from this: https://github.com/bevyengine/bevy/blob/v0.12.0/assets/shaders/game_of_life.wgsl
Setting "NO VSYNC" in bevy doesn't actually let the game go over 60 fps - there's probably some way we can trace into the compute shader and get its runtime from there?
#![no_std]
#![feature(asm_experimental_arch)]
use spirv_std::{
    spirv,
    glam::{UVec3, IVec2, Vec4}, Image,
};

fn hash(value: u32) -> u32 {
    let mut state = value;
    state = state ^ 2747636419;
    state = state * 2654435769;
    state = state ^ state >> 16;
    state = state * 2654435769;
    state = state ^ state >> 16;
    state = state * 2654435769;
    return state;
}

fn randomFloat(value: u32) -> f32 {
    return (hash(value) as f32) / 4294967295.0;
}

pub type Image_2D_SNORM = Image!(2D, format=rgba8_snorm, sampled=false);

fn is_alive(location: IVec2, offset_x: i32, offset_y: i32, image: &Image_2D_SNORM) -> i32 {
    let value = image.read(location + IVec2::new(offset_x, offset_y));
    return value.x as i32;
}

fn count_alive(location: IVec2, image: &Image_2D_SNORM) -> i32 {
    return is_alive(location, -1, -1, image) +
        is_alive(location, -1, 0, image) +
        is_alive(location, -1, 1, image) +
        is_alive(location, 0, -1, image) +
        is_alive(location, 0, 1, image) +
        is_alive(location, 1, -1, image) +
        is_alive(location, 1, 0, image) +
        is_alive(location, 1, 1, image);
}

#[spirv(compute(threads(8,8)))]
pub fn init(
    #[spirv(global_invocation_id)] id: UVec3,
    #[spirv(num_workgroups)] num: UVec3,
    #[spirv(descriptor_set = 0, binding = 0)] texture: &Image_2D_SNORM,
) {
    let coord = IVec2::new(id.x as i32, id.y as i32);
    let randomNumber = randomFloat(id.y * num.x + id.x);
    let alive = randomNumber > 0.9;
    let alive_f = alive as i32 as f32;
    let pixel = Vec4::new(alive_f, alive_f, alive_f, 1.0);
    unsafe {
        texture.write(coord, pixel);
    }
}

#[spirv(compute(threads(8,8)))]
pub fn update(
    #[spirv(global_invocation_id)] id: UVec3,
    #[spirv(num_workgroups)] num: UVec3,
    #[spirv(descriptor_set = 0, binding = 0)] texture: &Image!(2D, format=rgba8_snorm, sampled=false),
) {
    let coord = IVec2::new(id.x as i32, id.y as i32);
    let n_alive = count_alive(coord, texture);
    let alive = n_alive == 3 || n_alive == 2 && is_alive(coord, 0, 0, texture) == 1;
    let alive_f = alive as i32 as f32;
    let pixel = Vec4::new(alive_f, alive_f, alive_f, 1.0);
    unsafe { spirv_std::arch::workgroup_memory_barrier_with_group_sync() };
    unsafe {
        texture.write(coord, pixel);
    }
} |
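For anyone wanting to sanity-check the shader output, here is a minimal CPU-side sketch of the same hash and update rule in plain Rust. It is not part of the PR. The names `random_float` and `step`, the flat `u8` grid, and the choice to treat out-of-bounds neighbours as dead are all assumptions made for illustration, and the multiplications use `wrapping_mul` so the hash wraps like 32-bit GPU arithmetic instead of panicking in a debug build.

```rust
/// CPU mirror of the shader's integer hash, with the u32 wrap-around made explicit.
pub fn hash(value: u32) -> u32 {
    let mut state = value;
    state ^= 2747636419;
    state = state.wrapping_mul(2654435769);
    state ^= state >> 16;
    state = state.wrapping_mul(2654435769);
    state ^= state >> 16;
    state = state.wrapping_mul(2654435769);
    state
}

/// Maps the hash onto roughly [0, 1], as `randomFloat` does in the shader.
pub fn random_float(value: u32) -> f32 {
    hash(value) as f32 / 4294967295.0
}

/// Computes one generation of a row-major `width` x `height` grid of 0/1 cells.
/// Assumption: neighbours outside the grid count as dead.
pub fn step(cells: &[u8], width: usize, height: usize) -> Vec<u8> {
    assert_eq!(cells.len(), width * height);
    let mut next = vec![0u8; cells.len()];
    for y in 0..height {
        for x in 0..width {
            // Count the eight neighbours, skipping the cell itself.
            let mut n_alive = 0u32;
            for dy in -1i64..=1 {
                for dx in -1i64..=1 {
                    if dx == 0 && dy == 0 {
                        continue;
                    }
                    let nx = x as i64 + dx;
                    let ny = y as i64 + dy;
                    if nx >= 0 && ny >= 0 && (nx as usize) < width && (ny as usize) < height {
                        n_alive += cells[ny as usize * width + nx as usize] as u32;
                    }
                }
            }
            // Same rule as the shader: born on 3, survives on 2 or 3.
            let alive = n_alive == 3 || (n_alive == 2 && cells[y * width + x] == 1);
            next[y * width + x] = alive as u8;
        }
    }
    next
}
```

Unlike the shader, `step` writes into a fresh buffer rather than the texture it reads from, so a GPU run may differ from it in places, particularly along workgroup borders.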
I've also thrown in some of my game logic (ballistic solution for 1st order viscosity) and it works exactly as expected, with speeds comparable to CPU rust (5x slower) and no value errors. I can't wait not to learn WGSL. Thanks for the code! |
Also I've set up a docker build process on the fork, so you don't have to install the nightly rust from 6 months ago on the host. Another note is that for compute shaders, the So it might be easier to use Maybe after the |
That's a lot of great info and insight. Even if it's slower, it's just great to know that it all works. I'm still a newbie to all this, so it'll take me a while to pore over everything. |
Trying to fix Bevy-Rust-GPU/bevy-rust-gpu#20
Reading the sources of
And trying to mash everything together