Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize spirv-builder clean build wall time #858

Open
1 of 5 tasks
repi opened this issue Apr 10, 2022 · 5 comments
Open
1 of 5 tasks

Optimize spirv-builder clean build wall time #858

repi opened this issue Apr 10, 2022 · 5 comments

Comments

@repi
Copy link
Contributor

repi commented Apr 10, 2022

We should look into improve our clean build times of when building spriv-builder & rustc_codegen_spirv as that is the main scenario in CI and also useful for users if our crates rebuild fast.

Right now it looks like we have a couple of quite unfortunate serial dependencies that significantly increases the wall time. Here is the current results with cargo build --release -p spirv-builder -Z timings on my AMD 5950x (32 vCPU):

image

Potential optimizations

  • Could spirv-builder be the one depending on spirv-tools instead of rustc_codegen_spirv?
    • I.e. move the responsibility of running spirv-opt/spirv-val and final linking to the builder.
    • Does that sound like it could be doable @eddyb ?
  • Remove/replace serde, quite crazy it takes 20 seconds to compile
    • Think we only have two use cases, our ModuleResult output and then the internal decorations that is using more complex serde implementation (CustomDecoration).
    • have no idea how serde+serde_json is taking such a crazy long time to build here though, can't see any other crates enabling tons of types that serde derive macros are used on or similar
    • ModuleResult would be trivial to replace serde with nanoserde, could work for CustomDecoration also?
    • Experiment to switch to nanoserde instead of serde #860
  • Remove/replace sanitize-filename dependency that compiles and uses regex crate for super trivial things
  • Optimize builds of rspirv itself somehow?
    • Unfortunate that it is so slow to build by itself but could be hard to fix and tons of types in it. No open issues on it in the rspirv repo as far as I could see
  • Determine why rustc_codegen_spirv is not split in frontend/codegen sections in the profile report.
    • This would enable spriv-builder itself to start building earlier and before the codegen of rustc_codegen_spirv is done.
    • Update: This is not possible as it it needs to be built as a dylib

If we do get rid of serde and manage to make sure rustc_codegen_spirv doesn't have to wait for the spirv-tools-sys build, we could get get a 15+ second wall time improvement here as rustc_codegen_spirv could start as soon as frontend section of rspirv has been built:
- image

@repi
Copy link
Contributor Author

repi commented Apr 10, 2022

Did an experiment to switch to nanoserde and it seems to work fine, there are some contraints and type/format output changes that one have to figure out (no PathBuf support), and does get rid of all the long serde compilation, 20 sec -> 1.7 sec.

Wall time benefit was just ~1 second though, but would enable real gains if one also finds a way not have rustc_codegen_spirv stall and wait for spirv-tools-sys build.

image

@repi

This comment was marked as outdated.

@bjorn3
Copy link
Contributor

bjorn3 commented Aug 8, 2022

You could split rustc_codegen_spirv into an rlib containing the actual implementation and a dylib which only depends on the rlib. The rlib will participate in pipelined compilation and the dylib will wait for all dependencies to have fully compiled before doing the final link. It may also be possible to move just the code using spirv-tools to the dylib, allowing most of the code of cg_spirv to be compiled at the same time as spirv-tools-sys.

@repi
Copy link
Contributor Author

repi commented Aug 8, 2022

ah that is an interesting idea and approach, what do you think of the feasibility of that @eddyb ?

@eddyb
Copy link
Contributor

eddyb commented Aug 8, 2022

Ahh I always forget about how dylibs behave just like executables, and the current state of pipelining - it makes sense in the grand scheme of there being no good way (yet) to block rustc invoking the linker, on the linker inputs actually being present, but it's basically a Cargo<->rustc communicative deficiency AFAIK, not even something limited by platform tooling.

What @bjorn3 is suggesting is basically "emulate what should happen, with today's tools" and we should probably do that.

If @oisyn wants to tackle this, a first approximation could be as simple as:

  • rename crates/rustc_codegen_spirv (e.g. add a suffix like -impl), and the name in its Cargo.toml, and take away any dylib-specific settings
  • create a minimal crates/rustc_codegen_spirv
    • Cargo.toml pretty much identical to what's currently there today (modulo deps?) and with an extra dependency on the new (e.g. rustc_codegen_spirv-impl) crate
    • src/lib.rs containing only extern crate rustc_codegen_spirv_impl; - in theory that should be enough to allow any exported symbols from the -impl crate to surface up to to the dylib and allow it to function

For @bjorn3's complete suggestion, I'm guessing this needs to go in the dylib:

/// This is the entrypoint for a hot plugged `rustc_codegen_spirv`
#[no_mangle]
pub fn __rustc_codegen_backend() -> Box<dyn CodegenBackend> {
// Override rustc's panic hook with our own to override the ICE error
// message, and direct people to `rust-gpu`.
let _rustc_hook = std::panic::take_hook();
let default_hook = std::panic::take_hook();
{
// NOTE(eddyb) the reason we can get access to the default panic hook,
// is that `std::panic::take_hook` has this phrase in its documentation:
//
// > If no custom hook is registered, the default hook will be returned.
//
// But just in case (races with other threads?), we can do it a few more
// times, and require that we get the same "boxed" ZST every time.
let more_hooks = [
std::panic::take_hook(),
std::panic::take_hook(),
std::panic::take_hook(),
std::panic::take_hook(),
];
assert_eq!(
std::mem::size_of_val(&*default_hook),
0,
"failed to acquire default panic hook using `std::panic::take_hook`, \
or default panic hook not a ZST anymore"
);
#[allow(clippy::vtable_address_comparisons)]
for other_hook in more_hooks {
assert!(
std::ptr::eq(&*default_hook, &*other_hook),
"failed to acquire default panic hook using `std::panic::take_hook`, \
or `std::panic::set_hook` was used on another thread"
);
}
}
std::panic::set_hook(Box::new(move |panic_info| {
default_hook(panic_info);
rustc_driver::report_ice(
panic_info,
"https://github.com/EmbarkStudios/rust-gpu/issues/new",
);
eprintln!("note: `rust-gpu` version {}\n", env!("CARGO_PKG_VERSION"));
}));
Box::new(SpirvCodegenBackend)
}

With that Box::new(SpirvCodegenBackend) at the very end being changed to Box::new(SpirvCodegenBackend { spirv_tools: ... }) (i.e. rustc_codegen_spirv-impl's definition of SpirvCodegenBackend needs a new field that holds some kind of trait object, function pointers, etc. - some mechanism to dynamically get access to the functionality we need from spirv_tools without it being statically accessed from rustc_codegen_spirv-impl).

@oisyn oisyn self-assigned this Nov 17, 2022
@oisyn oisyn removed their assignment Mar 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants
@eddyb @repi @bjorn3 @oisyn and others