-
Notifications
You must be signed in to change notification settings - Fork 195
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[wgsl-in] Transition to a pass-based architecture #2065
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR isn't suitable for us to merge. I think the best way to proceed is for you to host it as your own crate, where you can experiment as you please, and show people how they can use your front end in their own wgpu
code using wgpu::ShaderSource::Naga
. That way everyone can choose the tradeoffs that make sense for them.
There are a number of reasons we don't want to take this PR.
First of all, as you noted, this front end is 15x slower than the current Naga WGSL front end. I used the following command on master
and the multipass
branch:
cargo bench --bench criterion front/wgsl -- --measurement-time 30
Here's the 'before' and 'after' plot (red = master, blue = multipass):
Any fully compliant front end is almost certainly going to be a bit slower than our current front end, but I'm confident we can do a lot better than this.
Second, this PR takes on a bunch of new dependencies, which we would like to avoid. I haven't profiled, but I suspect that some of the slowdown is due to chumsky
and logos
; in particular, chumsky
states in its README.md
that it is much more concerned with the quality of the diagnostics and error recovery than with speed. That is a perfectly reasonable tradeoff, but again, I think we can do better.
The new dependencies also increase Naga compile time by around a third, for both naga + naga-cli - only builds and builds from scratch. We've tried hard to keep build times under control, since they impact developer experience pretty directly.
The new dependencies also create a review burden for embedders that want to use wgpu
in security-sensitive code, like Firefox. Sometimes that's the right burden to choose - it's certainly not the case that nobody should ever add a new dependency. But in this case, I don't think the tradeoff is right.
Finally, the architecture just uses a lot of intermediate steps and intermediate representations where I think a more direct path would be both faster and more legible. This front end has three phases: parsing, resolution, and lowering to Naga IR. I'm pretty sure we can get by with just two, and use a representation for the unresolved source code that is a minor variant on Naga IR, which will make the code easier for people familiar with the rest of Naga to follow.
I haven't profiled, but I suspect the intermediate steps are also responsible for some of the loss of performance. That is, I suspect that rewriting the parser is not going to get you back as much speed as you expect.
I've put some other comments at specific places in the code, in case they're useful to you. But overall, it's really not a matter of fixing a few problems here and there. When you first brought up the idea of rewriting the front end in Naga chat, I tried to encourage you to take a more incremental approach, which would have made it easier for us to give feedback you could incorporate as you go, and preserve the crucial characteristics of the original front end, like performance and a light touch on dependencies. By jumping in with a full rewrite, those characteristics were lost.
I didn't look into this PR's resolution algorithm in detail, nor the abstract type handling, but I take your word that they're working; that's certainly an advantage over Naga's present state. And I definitely expect your front end to produce better diagnostics than ours (at least for now), which many users will find valuable. So I encourage you to continue development of this front end in your own repository, and perhaps publish it as a separate crate.
If that approach doesn't appeal to you, the naga/wgpu license was chosen specifically to ensure that you have the right to fork. This is a feature, not a bug, intended for dealing with situations like this, where different teams have different priorities.
But we expect to be able to bring Naga into compliance with the spec over the coming months while preserving most (probably not all) of its current performance.
pub struct ResolveContext { | ||
dependency_context: DependencyContext, | ||
index: Index, | ||
reserved_matcher: AhoCorasick, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like an FxHashSet<Text>
would be better here than an AhoCorasick
matcher. You've already hashed the identifier once; it doesn't make sense to make another pass over it.
use crate::{front::wgsl::text::Text, Span}; | ||
|
||
#[derive(Clone, Debug, Default)] | ||
pub struct TranslationUnit { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This AST is very close to the grammar, but I don't think it's worth building all this intermediate structure. We should just touch the data once, lower it as far as we can immediately, and be done with it. The architecture here leads to odd bits like the check for generic parameters on the identifier in Resolver::expr
; it seems like the code could be structured so that situations like that (and others) simply never arise.
Naga tries hard to gather everything into arrays (the Arena
types, for example) and avoid heap allocation because such code runs very well on modern processors. As I say, I haven't profiled, but my guess is that you won't be able to get within 2x or 3x of the current Naga front end's run time without addressing this.
#[derive(Clone, Debug)] | ||
pub enum ExprKind { | ||
Underscore, | ||
VarDecl(Box<VarDecl>), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is very surprising to see variable declarations as an expression type. Overall, I had a hard time tracking how identifiers were resolved. I think this could be simplified a lot, while still implementing WGSL's semantics.
access_mode: Matcher<AccessMode>, | ||
address_space: Matcher<AddressSpace>, | ||
builtin: Matcher<Builtin>, | ||
interpolation_sample: Matcher<InterpolationSample>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I understand why you want a separate Matcher
for each role identifiers play in WGSL, but I think there are much nicer approaches to this. I outlined one possible idea in #1391.
I do agree with most of what you're saying, but do I think I can incrementally combine most of the stuff here with what's on master with changes that aren't too large. |
The current WGSL frontend does everything in a single pass, with parsing depending on input from the type-checker, etc. This creates problems like #1745, const eval, type inference, and also has generally poor diagnostics due to the lack of context.
This PR converts it into a more traditional pass-based architecture with four passes:
In its current state, it fixes:
enable
directives, so you have to explicitly opt into using them while writing shaders.Access
expression for array indexing.lhs_expression
forassignment/increment/decrement_statement
s wgpu#4383Planned features
const
operations inconst
variables #733workgroup_size
wgpu#4450&&
and||
wgpu#4394Breaking language changes
There are some WGSL programs that were accepted by Naga before, but will no longer be accepted:
All of Naga's extra native features will be gated behind WGSL features, and must beenable
d manually, so the shader author explicitly opts out of supporting browser WebGPU.For example:
binding_array
can be enabled by putting aenable binding_array;
at the top of the source.This is currently disabled.
let
s are no longer allowed in global declaration position,const
must be used instead.Breaking API changes
The frontend returns a
Result<Module, Vec<WgslError>>
, whereWgslError
is a type containing a diagnostic that can be trivially converted to a codespanDiagnostic
.Other changes
The frontend's diagnostics have been improved, and now multiple of them can be emitted at once.
A little example:
Given:
Since cyclic declarations are not allowed anywhere, Naga errors with:
Performance regression
I have used a parser combinator library,
chumsky
for parsing. This causes a severe regression in performance (thefront/wgsl
benchmark takes 60 ms instead of 4ms).However, the rest of the frontend has virtually no impact on performance, so regaining Naga's excellent performance should be possible by hand writing the parser.