gpu-dawn: reduce Dawn build and iteration times #124
See #124 Signed-off-by: Stephen Gutekanst <stephen@hexops.com>
After the changes above, we see some great gains:
Additionally I tracked down where time is spent:
After discussion with Jakub (the zld author), it seems highly likely that link time/performance can be improved, especially in those two functions above.
While investigating slow build times with [a large project](hexops/mach#124), I found that the compiler was reading nearly every C source file in my project from disk when rebuilding, despite no changes having been made. This accounted for several seconds of time (approx. 20-30% of running `zig build` with no changes to the sources). The cause was that comparisons of file mtimes would _always_ fail (the mtime of the file on disk was always newer than the one stored in the cache manifest), so the cache logic would always fall back to byte-for-byte comparison of the file contents on disk against the cache, reading every C source file in my project from disk during each rebuild. Because the file contents were the same, a cache hit occurred, and _despite the mtime being different, the cache manifest would not be updated._

One can reproduce this by building a Zig project so the cache is populated, and then changing the mtimes of its C source files to be newer than what is recorded in the cache (without altering file contents).

The fix is rather simple: we should always write the updated cache manifest regardless of whether a cache hit occurred (a cache hit doesn't indicate whether the manifest is dirty). Luckily, `writeManifest` already contains logic to determine whether a manifest is dirty and becomes a no-op if no change to the manifest file is necessary, so we merely need to ensure it is invoked.

Signed-off-by: Stephen Gutekanst <stephen@hexops.com>
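For reference, a rough sketch of the calling pattern this describes, assuming a `Cache.Manifest`-style API with `addFile()`, `hit()`, and `writeManifest()` (names and signatures are from memory and may differ between Zig versions):

```zig
// Sketch only: illustrates the fix described above, not the exact compiler
// code. Assumes a Cache.Manifest-style API with addFile(), hit(), and
// writeManifest().
var man = cache.obtain();
defer man.deinit();

_ = try man.addFile(c_source_path, null); // register input (mtime + content hash)

if (try man.hit()) {
    // Cache hit: reuse the previously built object. Before the fix the code
    // effectively stopped here, so a stale mtime recorded in the manifest was
    // never refreshed and every later rebuild re-read and hashed the file.
} else {
    // Cache miss: compile and record outputs as usual.
}

// The fix: always write the manifest. writeManifest() is already a no-op when
// nothing changed, but when only the mtime is stale it rewrites the manifest
// so future rebuilds can use the cheap mtime check again.
try man.writeManifest();
```

The important part is only the last call: writing the manifest unconditionally is what lets a hit with a stale mtime repair itself.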
Comparing build times on a really beefy Linux gaming laptop (i7-10875H, 8-core 5.1GHz, 32GB RAM) vs. macOS M1 (original chipset, 16GB RAM):
Really interesting numbers here: Why is building on Linux from scratch so much slower? Maybe the OpenGL+Vulkan backends really do take that much longer to compile than the Metal backend? And why are iteration times on macOS so much worse? I'm guessing zld performance just isn't there yet, but that gives us an idea of how much room for improvement we can expect from zld in the future!
Investigating from-scratch build times on M1 macOS:
Collected by swapping the end of the build's step-running code with:

```diff
+ var timer = try std.time.Timer.start();
  try s.make();
+ std.debug.print("{: >10}ms make '{s}'\n", .{timer.read() / std.time.ns_per_ms, s.name});
```
We could eliminate SPIRV support on macOS (it would require contributing a change upstream to Dawn), which, from experiments, would reduce build times on macOS from 2m38s to 1m57s (a 26% reduction). We could also eliminate it on Windows if we only target DirectX. Linux would still require it, as both the Vulkan and OpenGL backends depend on it.
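As a rough illustration only (the variable and helper names below are hypothetical, not gpu-dawn's actual build script), the per-target decision could look something like this in a `build.zig`:

```zig
// Hypothetical sketch: decide per-target whether SPIRV support needs to be
// compiled at all. All names here are illustrative only.
const flags = &[_][]const u8{"-std=c++17"};

const needs_spirv = switch (target.getOsTag()) {
    .macos => false, // Metal backend only, no SPIRV required
    .windows => false, // assuming we only target the DirectX backend
    else => true, // Linux: both the Vulkan and OpenGL backends require SPIRV
};

if (needs_spirv) {
    lib.addCSourceFiles(spirv_sources, flags);
}
```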
It's been pointed out in the Dawn issues that abseil's use is trivial (just an overpowered string format), but it's an outsized part of the build for its contribution. The pushback was that abseil has fast hashing and containers that could maybe be used in the future. In the meantime, the formatting routines rope in nearly the entire abseil library, spiraling down into time zones, language localizations, and much more. Might be worth dropping more notes on the issue, highlighting that abseil is one of the larger time-consuming build components. https://bugs.chromium.org/p/dawn/issues/detail?id=1148&q=abseil&can=2
@meshula thanks, and will do! Even if we eliminated abseil entirely, though, I still think build times here are kind of unacceptable. Almost all of the build time seems to come from spirv-tools, tint, DirectXShaderCompiler, and spirv-cross. I want to dig more into why these are so slow to compile, but I fear the real answer is simply that shader compilation + translation requires 2-4 different compilers:
I suspect that this amount of indirection is one of the reasons Dawn is likely so mature / less buggy, but also quite heavy.
Exploring whether or not we can reduce the amount of spirv-tools code that gets pulled in:
Building spirv-tools goes from 84s -> 30s if we eliminate Tint's SPIRV reader and the spirv-tools optimizer, nice!
Building spirv-tools goes from 30s -> 6s if we eliminate the dependency on the SPIRV validator (easy for macOS, probably doable on others).
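A hedged sketch of how that trimming could be exposed as build options (option names and source-file lists below are hypothetical, not the actual gpu-dawn build script):

```zig
// Hypothetical sketch: make optional spirv-tools components opt-in so that
// builds which don't need them never compile those sources. Option names
// and source-file lists are illustrative only.
const flags = &[_][]const u8{"-std=c++17"};

const spirv_opt = b.option(bool, "spirv-opt", "Compile the spirv-tools optimizer") orelse false;
const spirv_val = b.option(bool, "spirv-val", "Compile the SPIRV validator") orelse false;

lib.addCSourceFiles(spirv_tools_core_sources, flags);
if (spirv_opt) lib.addCSourceFiles(spirv_tools_opt_sources, flags);
if (spirv_val) lib.addCSourceFiles(spirv_tools_val_sources, flags);
```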
My current approach is as native as possible, so Metal/DX/Vk as appropriate. The sad thing is that Vulkan is most useful to me on the slowest platform, the rpi, where I am stubbornly building natively rather than cross-compiling from a desktop. That said, I love the speed gains implied for Metal/DX platforms at least. Maybe the thing to do there is to have an explicit but optional cross-platform build for the rpi, so that eating the Khronos toolchain build pain is an optional choice?
@meshula I am actually thinking we can have a build config option, maybe on by default, that fetches/uses prebuilt binaries for the target. You'd be able to toggle it off with the flip of a switch and get the build from source using just Zig (that's how the binaries would be produced). Thoughts on that? Also see #133 for another idea I have going on.
Cached binaries make a ton of sense for first-time builds and for iteration. A locally reproducible build can be used for air-gapped systems that can't pull binaries from the internet for whatever reason, and for security audits.
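A minimal sketch of how the on-by-default prebuilt-binaries toggle discussed above might be wired up in `build.zig` (the option name and the fetch/build helpers are hypothetical placeholders):

```zig
// Hypothetical sketch of the toggle described above; not gpu-dawn's actual
// build script. fetchPrebuiltDawn and buildDawnFromSource are placeholders
// for whatever fetch/build-from-source logic the package would provide.
const from_source = b.option(
    bool,
    "dawn-from-source",
    "Build Dawn from source with Zig instead of fetching prebuilt binaries",
) orelse false;

if (from_source) {
    buildDawnFromSource(b, lib, target); // slow path: fully reproducible, works air-gapped
} else {
    fetchPrebuiltDawn(b, lib, target); // fast path for first builds and quick iteration
}
```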
I looked into why gfx-rs/wgpu may be faster to compile than Dawn; some key differences I noticed:
Great news: Dawn no longer requires spirv-cross for OpenGL backends. This should speed up Linux compilation significantly and reduce binary sizes a bit! hexops-graveyard/dawn@a52abab
Dawn no longer uses spirv-cross for OpenGL backends: hexops-graveyard/dawn@a52abab
Hence, we no longer need to compile it. Helps #124
Signed-off-by: Stephen Gutekanst <stephen@hexops.com>
There's an upcoming demo about this, but ...
See hexops/mach#124 Signed-off-by: Stephen Gutekanst <stephen@hexops.com>
Update: Before I opened this issue, build and iteration times were quite slow. We've made good progress since; I'm keeping this issue open for further improvements. Results so far:
macOS M1 (original chipset) w/16GB RAM:
| `zig build` action | libgpu.a size | dawn-example size |
| --- | --- | --- |