
[wgsl-in] Transition to a pass-based architecture #2065

Closed · wants to merge 21 commits

Conversation

@SparkyPotato (Contributor) commented on Sep 18, 2022

The current WGSL frontend does everything in a single pass, with parsing depending on input from the type checker, and so on. This makes issues like #1745, const evaluation, and type inference hard to address, and the lack of context also leads to generally poor diagnostics.

This PR converts it into a more traditional pass-based architecture with four passes:

  1. Parsing.
  2. Name resolution.
  3. Declaration cycle detection and topological ordering.
  4. Lowering to Naga IR, while doing type inference and const evaluation.
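For orientation, here is a minimal sketch of how such a pipeline might be wired together. All of these names, types, and signatures are illustrative stand-ins, not this PR's actual API:

struct Ast;            // pass 1 output: syntax tree
struct ResolvedNames;  // pass 2 output: identifier -> declaration map
struct DeclOrder;      // pass 3 output: topologically ordered declarations
struct Module;         // stand-in for naga::Module
struct WgslError;      // the frontend's diagnostic type

fn parse(_source: &str) -> Result<Ast, Vec<WgslError>> { Ok(Ast) }
fn resolve(_ast: &Ast) -> Result<ResolvedNames, Vec<WgslError>> { Ok(ResolvedNames) }
fn order_decls(_ast: &Ast, _names: &ResolvedNames) -> Result<DeclOrder, Vec<WgslError>> { Ok(DeclOrder) }
fn lower(_ast: &Ast, _names: &ResolvedNames, _order: &DeclOrder) -> Result<Module, Vec<WgslError>> { Ok(Module) }

// Each pass consumes the previous pass's output; any pass can report a
// batch of diagnostics instead of bailing out on the first error.
fn compile(source: &str) -> Result<Module, Vec<WgslError>> {
    let ast = parse(source)?;               // 1. parsing
    let names = resolve(&ast)?;             // 2. name resolution
    let order = order_decls(&ast, &names)?; // 3. cycle detection + topological order
    lower(&ast, &names, &order)             // 4. lowering, type inference, const eval
}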

In its current state, it fixes:

  1. Support out-of-order module scope declarations in WGSL #1745
  2. Forbid local variable shadowing in WGSL #1044 according to the current WGSL spec.
  3. Adjust WGSL and IR for the break inside continue #1754 (I believe this has already been fixed?)
  4. Naga should not permit readable storage textures by default wgpu#4466: all of Naga's native extensions become WGSL enable directives, so shaders must explicitly opt in to use them.
  5. [wgsl-in] Remove fallthrough statement #2031
  6. WGSL front end rejects negative constant literal subscripts, but spec says they're permitted #989 by using only the dynamic Access expression for array indexing.
  7. [wgsl-in] Correctly parse lhs_expression for assignment/increment/decrement_statements wgpu#4383
  8. Global item does not support 'const' #2071
  9. [wgsl-in] allow functions to be declared literally after entry point #2072
  10. Can "- 1" be considered a const expression? #2020
  11. [wgsl-in] Error on attempting to use mixed operands during compound assignment #1906
  12. Referencing a function without a return type yields an unknown identifier error. #1775
  13. [wgsl-in] An 'expected one of ..' lexer error system would be nice wgpu#4329

Planned features

  1. [wgsl-in] Support const operations in const variables #733
  2. Cannot define fractional constants #471
  3. Support override-expression for workgroup_size wgpu#4450
  4. [wgsl-in] Implement const expressions #1829
  5. wgsl-in: naga accepts switch statement with different integer types #1777
  6. Pipeline-overridable constants wgpu#4366
  7. [wgsl-in] Rejection of composite consts without explicit nested types #1957
  8. [wgsl-in] Confusing error: "expected unsigned/signed integer literal, found '1'" #1996
  9. Poor diagnostics for signed/unsigned match #2068
  10. [wgsl-in] Rejection of matrix consts #1956
  11. Naga front ends do not properly short-circuit && and || wgpu#4394
  12. [meta] f16 support wgpu#4384
  13. [meta] Advanced compute support #1154

Breaking language changes

There are some WGSL programs that were accepted by Naga before, but will no longer be accepted:

  1. All of Naga's extra native features will be gated behind WGSL enable directives and must be turned on manually, so the shader author explicitly opts out of compatibility with browser WebGPU.
    For example, binding_array can be enabled by putting an enable binding_array; directive at the top of the source.

    This gating is currently disabled.
  2. let declarations are no longer allowed at module scope; const must be used instead.

Breaking API changes

The frontend returns a Result<Module, Vec<WgslError>>, where WgslError is a type containing a diagnostic that can be trivially converted to a codespan Diagnostic.
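For embedders, error handling might then look roughly like the sketch below. The parse_str entry point keeping its current name and a WgslError::to_diagnostic method are assumptions; the PR only specifies the Result<Module, Vec<WgslError>> shape and that the conversion to a codespan Diagnostic is trivial.

use codespan_reporting::diagnostic::Diagnostic;
use codespan_reporting::files::SimpleFile;
use codespan_reporting::term::{
    self,
    termcolor::{ColorChoice, StandardStream},
};

fn compile_or_report(source: &str) -> Option<naga::Module> {
    // Assumed entry point; the PR describes only the return type.
    match naga::front::wgsl::parse_str(source) {
        Ok(module) => Some(module),
        Err(errors) => {
            let file = SimpleFile::new("shader.wgsl", source);
            let writer = StandardStream::stderr(ColorChoice::Auto);
            let config = term::Config::default();
            for error in errors {
                // Hypothetical conversion method; not named in the PR.
                let diagnostic: Diagnostic<()> = error.to_diagnostic();
                term::emit(&mut writer.lock(), &config, &file, &diagnostic)
                    .expect("failed to write diagnostic");
            }
            None
        }
    }
}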

Other changes

The frontend's diagnostics have been improved, and multiple diagnostics can now be emitted at once.
A small example. Given:

const a: i32 = b;
const b = c;
const c = S();

struct S {
    field: array<S, a>,
}

Since cyclic declarations are not allowed anywhere, Naga errors with:

error: cyclic declarations are not allowed
  ┌─ wgsl:1:7
  │
1 │ const a: i32 = b;
  │       ^        ^ uses
  │       │
  │       this declaration
2 │ const b = c;
  │       ^   ^ uses
3 │ const c = S();
  │       ^   ^ uses
4 │
5 │ struct S {
  │        ^
6 │     field: array<S, a>,
  │                     ^ ending the cycle

error: recursive declarations are not allowed
  ┌─ wgsl:5:8
  │
5 │ struct S {
  │        ^
6 │     field: array<S, a>,
  │                  ^ uses itself here
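Errors like the ones above come out of pass 3 (declaration cycle detection). A generic sketch of that kind of check, a textbook depth-first search over the declaration dependency graph rather than the PR's actual code, could look like this:

use std::collections::HashMap;

#[derive(Clone, Copy)]
enum Mark { Unvisited, InProgress, Done }

// Returns declaration names ordered so that dependencies come first, or the
// name of a declaration that participates in a cycle.
fn check_cycles_and_order<'a>(deps: &HashMap<&'a str, Vec<&'a str>>) -> Result<Vec<&'a str>, &'a str> {
    fn visit<'a>(
        name: &'a str,
        deps: &HashMap<&'a str, Vec<&'a str>>,
        marks: &mut HashMap<&'a str, Mark>,
        order: &mut Vec<&'a str>,
    ) -> Result<(), &'a str> {
        match marks.get(name).copied().unwrap_or(Mark::Unvisited) {
            Mark::Done => return Ok(()),
            Mark::InProgress => return Err(name), // back edge: part of a cycle
            Mark::Unvisited => {}
        }
        marks.insert(name, Mark::InProgress);
        for &dep in deps.get(name).into_iter().flatten() {
            visit(dep, deps, marks, order)?;
        }
        marks.insert(name, Mark::Done);
        order.push(name);
        Ok(())
    }

    let mut marks = HashMap::new();
    let mut order = Vec::new();
    for &name in deps.keys() {
        visit(name, deps, &mut marks, &mut order)?;
    }
    Ok(order)
}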

Performance regression

I used the parser combinator library chumsky for parsing. This causes a severe performance regression: the front/wgsl benchmark takes 60 ms instead of 4 ms.

However, the rest of the frontend has virtually no impact on performance, so regaining Naga's excellent performance should be possible by hand-writing the parser.
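For comparison, a hand-written parser typically takes the usual recursive-descent shape: a token cursor with peek/bump plus one function per grammar rule, accumulating errors instead of stopping at the first one. A minimal, generic sketch over a toy grammar (not WGSL, and not this PR's code):

#[derive(Clone, Copy, PartialEq, Debug)]
enum TokenKind { Ident, Number, Plus, LParen, RParen, Eof }

struct Parser<'a> {
    tokens: &'a [TokenKind],
    pos: usize,
    errors: Vec<String>,
}

impl<'a> Parser<'a> {
    fn peek(&self) -> TokenKind {
        self.tokens.get(self.pos).copied().unwrap_or(TokenKind::Eof)
    }

    fn bump(&mut self) -> TokenKind {
        let kind = self.peek();
        self.pos += 1;
        kind
    }

    // expr := term ('+' term)*
    fn expr(&mut self) {
        self.term();
        while self.peek() == TokenKind::Plus {
            self.bump();
            self.term();
        }
    }

    // term := Ident | Number | '(' expr ')'
    fn term(&mut self) {
        match self.bump() {
            TokenKind::Ident | TokenKind::Number => {}
            TokenKind::LParen => {
                self.expr();
                if self.bump() != TokenKind::RParen {
                    // Record and continue, so several errors can be reported per run.
                    self.errors.push("expected ')'".to_string());
                }
            }
            other => self.errors.push(format!("unexpected token {:?}", other)),
        }
    }
}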

@jimblandy (Member) left a comment

This PR isn't suitable for us to merge. I think the best way to proceed is for you to host it as your own crate, where you can experiment as you please, and show people how they can use your front end in their own wgpu code using wgpu::ShaderSource::Naga. That way everyone can choose the tradeoffs that make sense for them.

There are a number of reasons we don't want to take this PR.

First of all, as you noted, this front end is 15x slower than the current Naga WGSL front end. I used the following command on master and the multipass branch:

cargo bench --bench criterion front/wgsl -- --measurement-time 30

Here's the 'before' and 'after' plot (red = master, blue = multipass):

[Criterion plot: front/wgsl benchmark, master (red) vs. multipass (blue)]

Any fully compliant front end is almost certainly going to be a bit slower than our current front end, but I'm confident we can do a lot better than this.

Second, this PR takes on a bunch of new dependencies, which we would like to avoid. I haven't profiled, but I suspect that some of the slowdown is due to chumsky and logos; in particular, chumsky states in its README.md that it is much more concerned with the quality of the diagnostics and error recovery than with speed. That is a perfectly reasonable tradeoff, but again, I think we can do better.

The new dependencies also increase Naga compile time by around a third, for both naga + naga-cli-only builds and builds from scratch. We've tried hard to keep build times under control, since they impact developer experience pretty directly.

The new dependencies also create a review burden for embedders that want to use wgpu in security-sensitive code, like Firefox. Sometimes that's the right burden to choose - it's certainly not the case that nobody should ever add a new dependency. But in this case, I don't think the tradeoff is right.

Finally, the architecture just uses a lot of intermediate steps and intermediate representations where I think a more direct path would be both faster and more legible. This front end has three phases: parsing, resolution, and lowering to Naga IR. I'm pretty sure we can get by with just two, and use a representation for the unresolved source code that is a minor variant on Naga IR, which will make the code easier for people familiar with the rest of Naga to follow.
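(One way to read the "minor variant on Naga IR" idea, as an editorial sketch rather than anything from this PR or from Naga itself, is an expression type whose references start out as identifiers and become arena handles during lowering:

use std::marker::PhantomData;

struct Handle<T>(u32, PhantomData<T>); // stand-in for a typed arena handle
struct GlobalVariable;                 // stand-in for naga::GlobalVariable

// Expression as written in the source: references are still plain identifiers.
enum UnresolvedExpr {
    Ident(String),
    Binary { left: Box<UnresolvedExpr>, right: Box<UnresolvedExpr> },
}

// The same shape after lowering: identifiers have been resolved to handles.
enum Expr {
    GlobalVariable(Handle<GlobalVariable>),
    Binary { left: Handle<Expr>, right: Handle<Expr> },
}

This keeps the unresolved form structurally close to the final IR, so lowering is mostly a matter of replacing identifiers with handles.)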

I haven't profiled, but I suspect the intermediate steps are also responsible for some of the loss of performance. That is, I suspect that rewriting the parser is not going to get you back as much speed as you expect.

I've put some other comments at specific places in the code, in case they're useful to you. But overall, it's really not a matter of fixing a few problems here and there. When you first brought up the idea of rewriting the front end in Naga chat, I tried to encourage you to take a more incremental approach, which would have made it easier for us to give feedback you could incorporate as you go, and preserve the crucial characteristics of the original front end, like performance and a light touch on dependencies. By jumping in with a full rewrite, those characteristics were lost.

I didn't look into this PR's resolution algorithm in detail, nor the abstract type handling, but I take your word that they're working; that's certainly an advantage over Naga's present state. And I definitely expect your front end to produce better diagnostics than ours (at least for now), which many users will find valuable. So I encourage you to continue development of this front end in your own repository, and perhaps publish it as a separate crate.

If that approach doesn't appeal to you, the naga/wgpu license was chosen specifically to ensure that you have the right to fork. This is a feature, not a bug, intended for dealing with situations like this, where different teams have different priorities.

But we expect to be able to bring Naga into compliance with the spec over the coming months while preserving most (probably not all) of its current performance.

pub struct ResolveContext {
    dependency_context: DependencyContext,
    index: Index,
    reserved_matcher: AhoCorasick,
@jimblandy (Member) commented:
It seems like an FxHashSet<Text> would be better here than an AhoCorasick matcher. You've already hashed the identifier once; it doesn't make sense to make another pass over it.
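As an illustration of that suggestion, a reserved-word check over a hash set might look like the following. It uses plain &str keys rather than the frontend's Text interner, purely to keep the example self-contained:

use rustc_hash::FxHashSet;

// Excerpt only; WGSL reserves many more words.
const RESERVED: &[&str] = &["array", "const", "fn", "let", "struct", "var"];

struct ReservedWords {
    set: FxHashSet<&'static str>,
}

impl ReservedWords {
    fn new() -> Self {
        Self { set: RESERVED.iter().copied().collect() }
    }

    // One hash lookup per identifier, rather than another scan over its bytes.
    fn is_reserved(&self, ident: &str) -> bool {
        self.set.contains(ident)
    }
}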

use crate::{front::wgsl::text::Text, Span};

#[derive(Clone, Debug, Default)]
pub struct TranslationUnit {
@jimblandy (Member) commented:
This AST is very close to the grammar, but I don't think it's worth building all this intermediate structure. We should just touch the data once, lower it as far as we can immediately, and be done with it. The architecture here leads to odd bits like the check for generic parameters on the identifier in Resolver::expr; it seems like the code could be structured so that situations like that (and others) simply never arise.

Naga tries hard to gather everything into arrays (the Arena types, for example) and avoid heap allocation because such code runs very well on modern processors. As I say, I haven't profiled, but my guess is that you won't be able to get within 2x or 3x of the current Naga front end's run time without addressing this.
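For reference, the arena-and-handle pattern being described, in a stripped-down generic form (not Naga's actual Arena and Handle types):

use std::marker::PhantomData;

struct Handle<T> {
    index: u32,
    _marker: PhantomData<T>,
}

impl<T> Clone for Handle<T> {
    fn clone(&self) -> Self { *self }
}
impl<T> Copy for Handle<T> {}

struct Arena<T> {
    items: Vec<T>,
}

impl<T> Arena<T> {
    fn new() -> Self {
        Self { items: Vec::new() }
    }

    // Push a value and get back a small, copyable handle; all nodes of one
    // kind live contiguously, which is cache-friendly and avoids a heap
    // allocation per node.
    fn append(&mut self, value: T) -> Handle<T> {
        let index = self.items.len() as u32;
        self.items.push(value);
        Handle { index, _marker: PhantomData }
    }

    fn get(&self, handle: Handle<T>) -> &T {
        &self.items[handle.index as usize]
    }
}

// Expressions reference each other by handle rather than by Box:
enum Expr {
    Literal(f64),
    Add(Handle<Expr>, Handle<Expr>),
}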

#[derive(Clone, Debug)]
pub enum ExprKind {
    Underscore,
    VarDecl(Box<VarDecl>),
@jimblandy (Member) commented:
It is very surprising to see variable declarations as an expression type. Overall, I had a hard time tracking how identifiers were resolved. I think this could be simplified a lot, while still implementing WGSL's semantics.

access_mode: Matcher<AccessMode>,
address_space: Matcher<AddressSpace>,
builtin: Matcher<Builtin>,
interpolation_sample: Matcher<InterpolationSample>,
@jimblandy (Member) commented:
I think I understand why you want a separate Matcher for each role identifiers play in WGSL, but I think there are much nicer approaches to this. I outlined one possible idea in #1391.

@jimblandy closed this on Oct 4, 2022
@SparkyPotato (Contributor, Author) commented:
I do agree with most of what you're saying, but I do think I can incrementally combine most of the work here with what's on master, with changes that aren't too large.

@SparkyPotato deleted the multipass branch on February 1, 2023.