
Proposal: Add type for unreachable states #1379

Open
RossTate opened this issue Sep 16, 2020 · 67 comments

Comments

@RossTate

Problem

The current design of the unreachable instruction (and br and the like) imposes an additional decidability constraint on WebAssembly: for any instruction sequence, we need to be able to determine if there exists a typing for the sequence.

This problem will grow more difficult as WebAssembly grows (until we make it trivial with this proposal). Even just the small complexity introduced by adding reference types has caused a bug in the reference interpreter (WebAssembly/reference-types#111), imposing yet another layer of complexity on the reference interpreter solely for type-checking unreachable code. This will be made worse by br_on_null in the typed references proposal, which will require adding an "unknown reference type" case.

This decidability constraint seems to already be restricting the design of WebAssembly. For example, here in #1365 it is raised as an issue for adding a dup instruction. This is because (unannotated) dup and pick instructions would require you to reason that the duplicated type is used consistently across multiple instructions, requiring the type-checker to track constraints on type variables and ensure they're satisfiable, all for the sake of just type-checking unreachable code.

Short-Term Solution

A short-term solution is to add an unreachable type (it seems specifically to be a result type) to the typing rules but not to the grammar. Any (syntactically valid) instruction sequence would have type unreachable -> unreachable. Because this changes nothing besides validation and accepts strictly more programs, we could make this change to any proposal spec and it would be backwards compatible.

This solution does not seem to have been discussed in the design rationale. It also seems to be compatible with all the desirable properties provided there except for the following:

It is agnostic to reachability: neither the specification nor producers need to define, check, or otherwise care about reachability (unless they choose to, e.g. to perform dead code elimination).

However, this property seems to already not hold for even the original WebAssembly. In particular, select requires type-directed compilation, and without the relevant type information there is no guidance as to how it should be compiled. So WebAssembly compilers need to reason about unreachability already, likely by simply not compiling unreachable code (which addresses the attack vector mentioned here).

Long-Term Solution

A long-term solution would extend the grammar to account for an unreachable type. For example, a function could declare its type as [i32 i32] -> unreachable. WebAssembly compilers could even use this information to optimize calls to such functions, i.e. no need to save registers before the call if the call will never return! (Though of course you still have to consider exceptions.) This solution would involve many more design decisions, some of which might be informed by other discussions that are in flight, so it would probably be best to explore this option as part of some larger effort rather than as an isolated proposal.

@rossberg
Member

This exact type was previously introduced as part of the reference types proposal, for the reasons you mention, and some others (it wasn't needed for the MVP, because the type algebra was simple enough). However, as a bottom type, it relies on subtyping, so it got removed in response to the request to remove subtyping. I explicitly called out that this also axes the bottom type. I'm not surprised that we overlooked an implication like this.

Subtyping and the bottom type are currently reintroduced in the func-ref proposal. But it looks like we have to move it back to reftypes. :(

@conrad-watt
Contributor

conrad-watt commented Sep 16, 2020

I agree that at least the short-term solution is desirable (I sketched out the formal rules here).

FWIW, the design rationale you linked does explicitly address this:

Other solutions that have been discussed would lose most of the above properties. For example:
|...snip...|
Allowing but not type-checking unreachable code would break decomposability and requires the spec to provide a syntactic definition of reachability (which is unlikely to match the definition otherwise used by producers).

I think the concerns about the "syntactic definition of reachability" are overstated. To implement a type checker today, exactly the same set of instructions needs to be handled specially because of their polymorphic output types. I'm also not sure if anyone is relying on decompositionality (I could believe reliance on the converse).

The comparison to the JVM at the end of that rationale doesn't appear to be accurate. The analogy in Wasm for the JVM's approach would be to completely forbid syntactically dead code.
(Quote from the latest JVM standard (link))

It is illegal to have code after an unconditional branch without a stack map frame being provided for it.

So the JVM does not have to deal with any polymorphism issues in dead code. In an earlier version of the JVM (I think prior to the introduction of stack maps), return explicitly set the output type to empty (link).

FWIW, another "design smell" surrounding this - when I implemented a verified type checker for Wasm, easily 90% of the effort both in implementation and in correctness proof was concerned purely with handling the dead code case. This isn't precise but it's a rough indicator of the mental overhead this design decision imposes on type checking, even before typed objects are introduced.

EDIT: missed Andreas' response

This exact type was previously introduced as part of the reference types proposal, for the reasons you mention, and some others (it wasn't needed for the MVP, because the type algebra was simple enough). However, as a bottom type, it relies on subtyping, so it got removed in response to the request to remove subtyping.

I don't think this type should rely on subtyping. It's the type of the whole stack, not the type of an individual element. To allow dead code to skip validation, all you need are the typing rules

C ⊢ unreachable : t* -> ⊥
<similar for return, br, br_table>

C ⊢ e* : ⊥ -> t*

This should work even if t* can contain object types.
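A minimal sketch of how these two rules could drive a checker (illustrative Python only; the instruction encoding as (name, inputs, outputs) triples is a made-up convention for this sketch, not anything from the spec):

```python
BOTTOM = "⊥"  # the whole-stack bottom type

def check_seq(instrs, stack, expected_out):
    # instrs: list of (name, input_types, output_types) triples
    for name, ins, outs in instrs:
        if stack == BOTTOM:
            # C ⊢ e* : ⊥ -> t* — everything after a stack-ending
            # instruction validates trivially within this block
            return expected_out
        if name == "unreachable":
            # C ⊢ unreachable : t* -> ⊥ — the whole stack becomes ⊥
            stack = BOTTOM
            continue
        n = len(ins)
        if n and stack[-n:] != ins:
            raise TypeError(f"{name}: need {ins}, have {stack}")
        stack = stack[: len(stack) - n] + outs
    if stack == BOTTOM or stack == expected_out:
        return expected_out
    raise TypeError(f"block result mismatch: {stack} vs {expected_out}")
```

Note that ⊥ replaces the whole stack, not a single slot, so no per-element polymorphic bookkeeping is needed.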

@RossTate
Author

My impression of the discussion about subtyping for reference types was that it was centered around a (near-)bottom heap type. I only just learned that there was a bottom value type in the proposal as well. As @conrad-watt points out, the suggestion here is a bottom state/result type, which avoids many of the problems of bottom heap/value types while providing a simpler solution to the same problem that the bottom value type seems to have been solving. (It's also simple enough that its rules can be formulated without explicitly adding subtyping.)

FWIW, the design rationale you linked does explicitly address this:

The solution here is slightly different because it gives a type to unreachable code. This, for example, makes it still decomposable. So same practical effect, but with slightly nicer theoretical properties.

@conrad-watt
Contributor

conrad-watt commented Sep 16, 2020

The solution here is slightly different because it gives a type to unreachable code. This, for example, makes it still decomposable. So same practical effect, but with slightly nicer theoretical properties.

In the short-term solution, the "type" is just a formal artefact to express skipping validation. If the "skipping validation" approach had been agreed on at an informal level, it would have almost certainly been formalised in this way. I agree that the approaches diverge when considering the long-term solution.

This, for example, makes it still decomposable.

I don't believe the resulting language would be decomposable in the same way that the current language is. I also don't believe this is a big deal :P

@RossTate
Author

Agreed. So do we have insights on reasons to not skip validation that have not been addressed above?

@conrad-watt
Contributor

conrad-watt commented Sep 16, 2020

It is agnostic to reachability: neither the specification nor producers need to define, check, or otherwise care about reachability (unless they choose to, e.g. to perform dead code elimination).

However, this property seems to already not hold for even the original WebAssembly. In particular, select requires type-directed compilation, and without the relevant type information there is no guidance as to how it should be compiled. So WebAssembly compilers need to reason about unreachability already, likely by simply not compiling unreachable code (which addresses the attack vector mentioned here).

I'd like to hear implementers' perspectives on how dead code is currently handled (e.g. in streaming compilation), and whether they'd have any concerns if it was no longer validated. The quoted point (from @RossTate's OP) seems very reasonable.

@rossberg
Member

@conrad-watt:

I think the concerns about the "syntactic definition of reachability" are overstated.

Not really. We actually had lengthy and heated discussions about the details of that at the time. For example, whether it should extend beyond a block's end. There is no canonical definition. At the same time, an impedance mismatch in the notion of unreachable would get in the way of codegen.

The analogy in Wasm for the JVM's approach would be to completely forbid syntactically dead code.

The relevant aspect of the analogy is that the JVM does not forbid dead code, and that's pretty important for producers. It merely requires it to be type-annotated.

I don't think this type should rely on subtyping.

If you want to keep checking dead code, then a simple bottom stack type would not be enough, since there can be known types on the top of the stack (it becomes more like a row variable). Either way, it's a separate, more complex extension. We could add that, too, but it seems less bang for the buck, in the sense that it won't change engines, merely complicate the spec. Depending on the instruction set, it might not even replace the bottom value type (e.g., if you had pick without a stack type annotation).

FWIW, even on stack types, it would still be a form of subtyping (or something largely equivalent).

In the short-term solution, the "type" is just a formal artefact to express skipping validation.

That's not correct. The bottom value type does not "skip validation", it merely makes it more permissive. In particular, something like (unreachable) (i32.const 0) (i64.add) is still rejected.
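To illustrate that distinction, here is a hypothetical sketch (not any engine's actual code) of the bottom value type scheme: after `unreachable` the stack can conjure ⊥ operands on demand, but values pushed afterwards are still tracked concretely, so the example above still fails:

```python
BOT = "⊥"  # bottom *value* type, which matches any expected type

class PolyStack:
    """Type stack after `unreachable`: concrete values on top, ⊥s below."""
    def __init__(self):
        self.vals = []

    def push(self, t):
        self.vals.append(t)

    def pop(self, expected):
        if not self.vals:
            return BOT  # conjure a ⊥ from the polymorphic base
        got = self.vals.pop()
        if got not in (expected, BOT):
            raise TypeError(f"expected {expected}, got {got}")
        return got

s = PolyStack()            # state just after (unreachable)
s.push("i32")              # (i32.const 0)
try:
    s.pop("i64")           # first pop of (i64.add): finds a concrete i32
except TypeError as e:
    print("rejected:", e)  # validation still fails under this scheme
```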

I'm also not sure if anyone is relying on decompositionality

In general, it means that dead code is still type-checked to the extent that it could be executed. As I mentioned elsewhere, that, e.g., addressed potential security concerns.

likely by simply not compiling unreachable code (which addresses the attack vector mentioned here)

Among other things, that work-around doesn't apply to interpreters.

@conrad-watt
Contributor

conrad-watt commented Sep 16, 2020

Not really. We actually had lengthy and heated discussions about the details of that at the time. For example, whether it should extend beyond a block's end. There is no canonical definition. At the same time, an impedance mismatch in the notion of unreachable would get in the way of codegen.

My point is that all implementations have converged around a set of instructions which, at validation time, must produce PolyStack [] as an output type. It is precisely these instructions that would be considered "syntactically dead", and now produce ⊥. I don't think there would be the same heated debate again. Deliberately or not, WebAssembly has effectively already adopted a definition of syntactic deadness through its non-deterministic output types.

If you want to keep checking dead code, then a simple bottom stack type would not be enough, since there can be known types on the top of the stack (it becomes more like a row variable).

I addressed this specifically with the additional typing rule C ⊢ e* : ⊥ -> t*. Generating a ⊥ type as output allows remaining instructions in a straight line to be skipped. This also confines the effects of ⊥ to within the current block, which seems the obvious choice.

Note that with @RossTate's extended proposal, which I'm not arguing for, block types could explicitly declare their output type as ⊥.

That's not correct. The bottom value type does not "skip validation", it merely makes it more permissible. In particular, something like (unreachable) (i32.const 0) (i64.add) is still rejected.

See above (also, in this proposal, ⊥ is the type of the whole stack, not of a single stack element - it's not the same as the bottom type from the earlier reference types draft).

Among other things, that work-around doesn't apply to interpreters.

I remember you saying that WebAssembly shouldn't be designed for interpreters! In any case, the code still has to be parsed; a security-conscious interpreter would have options to harden itself. But this all seems like a hypothetical.

EDIT:

The relevant aspect of the analogy is that the JVM does not forbid dead code, and that's pretty important for producers. It merely requires it to be type-annotated.

I was a little off with my analogy - I should have focussed on the point that, on the consumer side, the JVM has never had to handle this kind of polymorphic typing of dead code. I can almost believe that forbidding dead code could be annoying for some producers, but do they care about it being validated? Maybe Wasm could have been designed with input type annotations for dead code from the start, but we're now constrained by webcompat.

@RossTate
Author

My impression from @rossberg's first post and from @conrad-watt is that neither of y'all have objections to at least the short-term solution, so I'd like to hear directly from others about any objections they have.

@conrad-watt
Contributor

My impression from @rossberg's first post and from @conrad-watt is that neither of y'all have objections to at least the short-term solution, so I'd like to hear directly from others about any objections they have.

@rossberg's last comment was explicitly an objection to (his understanding of) the short-term solution. I hope he finds my arguments convincing, and I'm very interested to hear from implementers, but we shouldn't put words in his mouth.

@RossTate
Author

Ah, I see how I misread it. Thanks for correcting me.

@fitzgen
Contributor

fitzgen commented Sep 16, 2020

Probably ~half the tricky fuzz bugs we've found in the wasmparser crate's validator (which powers wasmtime, cranelift, and aarch64 spidermonkey) and in cranelift's wasm-to-ir translator involve unreachable code, often with its interaction with new proposals (e.g. multi-value).

I've heard similar things from people who've hacked on spidermonkey's other wasm validator; hopefully those folks can speak up here as well.

The short-term proposal (where we limit its effects to within a block) sounds like a great idea to me.

@tlively
Member

tlively commented Sep 16, 2020

Binaryen has always had an explicit unreachable type in its IR that behaves almost like the proposed long-term solution, so the short-term solution would not require any meaningful changes to most of Binaryen and would even allow some nice simplifications. It would also help reduce the cognitive overhead of working with Binaryen IR by making its unreachable handling more similar to Wasm's.

@binji
Member

binji commented Sep 16, 2020

Binaryen has always had an explicit unreachable type in its IR that behaves almost like the proposed long-term solution

Is that true? I was under the impression that binaryen's unreachable was quite a bit more complex than described here.

I addressed this specifically with the additional typing rule C ⊢ e* : ⊥ -> t*. Generating a ⊥ type as output allows remaining instructions in a straight line to be skipped.

It took me a bit to realize how this works. For others who were confused like me: the idea is that if ⊥ is on top of the type stack during validation, then this rule can consume any number of expressions (e*), and yield any number of result types (t*). This would allow it to match the expected results for that block.

EDIT: actually, ⊥ must be the entire type stack, see Conrad's comment below.

often with its interaction with new proposals (e.g. multi-value).

Agreed, I've found more cases where unreachability is complex in the newer proposals too (e.g. GC) when the annotations are removed. I believe this is essentially the same problem as select.

@conrad-watt
Contributor

conrad-watt commented Sep 16, 2020

the idea is that if ⊥ is on top of the type stack during validation, then this rule can consume any number of expressions (e*), and yield any number of result types (t*)

Just to be 100% precise, with this proposal, a type stack can be either t* or ⊥, so ⊥ isn't just the top of the type stack, it's the whole thing. We wouldn't have to handle anything like t : ⊥ or ⊥ : t.

EDIT:
Just to note again, without @RossTate's extended proposal, ⊥ wouldn't actually show up in any program type annotations. It's purely a formal contrivance to represent validation being skipped.

Is that true? I was under the impression that binaryen's unreachable was quite a bit more complex than described here.

Agree there are probably mismatches (no worse than currently). For example I think binaryen can label the output of an infinite loop as unreachable independent of its explicit type annotation.

@tlively
Member

tlively commented Sep 16, 2020

Binaryen's unreachable is similar to the long term solution in that blocks, loops, and other control flow structures can be unreachable. The complexity comes from Binaryen's rules about when they cannot be unreachable, e.g. when they are branch targets. So actually an infinite loop cannot be unreachable in Binaryen because it is necessarily a branch target. (Edit: this is true for blocks but not loops, so infinite loops do end up being unreachable. Complicated!) But those details aren't so important here.

Instructions in Binaryen that are not control flow structures and not unreachable-polymorphic can have unreachable type only when they have unreachable-typed children. So Binaryen's i32.add instruction can have types (i32, i32) -> i32, (unreachable, i32) -> unreachable, (i32, unreachable) -> unreachable, or (unreachable, unreachable) -> unreachable. Under those rules, instructions that take no arguments cannot be unreachable, so for example f32.const only ever has type () -> f32. With this proposal, we would have to allow instructions with no arguments to be unreachable as well, for example letting f32.const have type () -> unreachable. But that's a trivial change that should be mostly mechanical.
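A toy version of that propagation rule (illustrative only, not Binaryen's actual code):

```python
UNREACHABLE = "unreachable"

def result_type(declared_result, child_types):
    # A non-control-flow instruction takes the unreachable type
    # exactly when at least one of its children does.
    if UNREACHABLE in child_types:
        return UNREACHABLE
    return declared_result

# i32.add with an unreachable operand is itself unreachable:
assert result_type("i32", ["i32", UNREACHABLE]) == UNREACHABLE
# f32.const has no children, so under these rules it is never
# unreachable; the proposal would relax that for dead code.
assert result_type("f32", []) == "f32"
```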

@conrad-watt
Contributor

So Binaryen's i32.add instruction can have types (i32, i32) -> i32, (unreachable, i32) -> unreachable, (i32, unreachable) -> unreachable, or (unreachable, unreachable) -> unreachable

That sounds a little different, because you're using unreachable as the type of a single stack element, rather than as the type of the whole stack (see my last comment above). In any case, it's not crucial that Binaryen's internals match the Wasm spec types precisely (considering they already don't!).

@taralx

taralx commented Sep 17, 2020

Given that we have no "unreachable" type, and blocks have to have types, the only places I can think of where someone is type-checking with an unknown are between a branch and an "end", correct? Or did I miss something?

@RossTate
Author

I believe that's correct (though I'm implicitly refining "branch" to unreachable/br/throw and the like).

@rossberg
Member

rossberg commented Sep 17, 2020

@conrad-watt:

My point is that all implementations have converged around a set of instructions which, at validation time, must produce PolyStack [] as an output type. It is precisely these instructions that would be considered "syntactically dead", and now produce ⊥.

You may assume that with hindsight, but all I can say is that there was no agreement on this at the time, and different definitions were proposed by different folks. The current, more canonical semantics actually served as a kind of forcing function for the solution you see now.

If you want to keep checking dead code, then a simple bottom stack type would not be enough, since there can be known types on the top of the stack (it becomes more like a row variable).

I addressed this specifically with the additional typing rule C ⊢ e* : ⊥ -> t*. Generating a ⊥ type as output allows remaining instructions in a straight line to be skipped.

Note how I specifically said "If you want to keep checking dead code", which this would not be addressing.

I remember you saying that WebAssembly shouldn't be designed for interpreters!

Well, the context probably was that its design doesn't need to be optimised for interpreters, which was true for a long time (although I think @titzer changed his mind about that recently). But security obviously is a different consideration.

@rossberg's last comment was explicitly an objection to (his understanding of) the short-term solution.

Right, I was initially misreading that as the bottom value type solution, which is just a minor tweak to what all engines are implementing already. The bottom result type is a more fundamental and likely more controversial change.

But just to be clear, I'm just conveying the discussions we had. The all-code-must-be-valid argument isn't mine (even though I'm somewhat sympathetic). IIRC, it was several engine folks who pushed it most strongly at the time, but I don't remember the details.

@RossTate's OP:

This [the short-term] solution does not seem to have been discussed in the design rationale.

It is a version of the don't-check-dead-code solution (for one specific definition of dead code), which is discussed in the doc, and was originally discussed and rejected by implementers; see above. Whether that has changed I don't know.

It also seems to be compatible with all the desirable properties provided there except for the following:

It is agnostic to reachability: neither the specification nor producers need to define, check, or otherwise care about reachability (unless they choose to, e.g. to perform dead code elimination).

However, this property seems to already not hold for even the original WebAssembly. In particular, select requires type-directed compilation, and without the relevant type information there is no guidance as to how it should be compiled. So WebAssembly compilers need to reason about unreachability already, likely by simply not compiling unreachable code (which addresses the attack vector mentioned here).

You are talking about the implementation details of consumers, while the statement you cite explicitly talks about semantics and producers targeting that semantics. (A consumer OTOH has many reasons to care about reachability, but will typically use a completely different definition, e.g., when it's decoding directly into SSA form. As for type-directed compilation, that's not an issue: it can just propagate bottom to an arbitrary type without otherwise caring.)

@rossberg
Member

@fitzgen:

Probably ~half the tricky fuzz bugs we've found in the wasmparser crate's validator (which powers wasmtime, cranelift, and aarch64 spidermonkey) and in cranelift's wasm-to-ir translator involve unreachable code

That's not surprising, since this is the path that will be least well-explored by regular tests. The question is what this implies. Arguing against extra safety checks solely on the basis that they are harder to test is a rather problematic proposition.

@titzer

titzer commented Sep 17, 2020

Having implemented dead code validation in multiple engines and having participated in the history of this unfortunate topic from the beginning, I'll just heave a sigh here first. There were loud voices that called for banning unreachable code and other loud voices going the other way. We went back and forth in multiple iterations, and it seems like the topic keeps popping up with every new addition to the type system. It'll probably never be clean to do, and despite our devotion to the formalism, implementors will continue to just wing it and sort it out via test cases in the spec suite. So suffice it to say, having a really simple mental model for it and a great test suite will be very valuable in a concrete way.

The union of all proposals in flight is actually nontrivial to do in a clean, fast way.

Let me just offer this perspective for how I've done this in the past. I'm not proposing any specific change to the formalism.

Pretty much all fast validators track unreachability as a bit in the control stack--i.e. syntactically reachable or not, for the current control block. A natural way to write this is to hide unreachability behind the "pop" operation. There are actually three pop operations:

  • popping a value with an expected type from the value stack
  • popping a polymorphic type from the value stack with specific constraint, e.g. that it is a reference type or function reference type
  • popping a value of any type from the stack (drop)

After an instruction that ends control flow (e.g. throw, unreachable, return_call), the value stack is emptied back to the start of the current control block and marked unreachable. Any pop operation beyond the top of the current control block would normally be an error, except when the control block is marked unreachable, in which case it may succeed, depending on which of the pop operations above is involved.
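A hypothetical sketch of this "hide unreachability behind pop" scheme (names are illustrative, not from any real engine; the constrained pop flavour is analogous to pop_expect and omitted for brevity):

```python
class Frame:
    def __init__(self, height):
        self.height = height        # value-stack height at block entry
        self.unreachable = False    # the per-block unreachability bit

UNKNOWN = "unknown"                 # stands in for a polymorphic popped type

class Validator:
    def __init__(self):
        self.vals = []              # value (type) stack
        self.ctrl = [Frame(0)]      # control stack of nested blocks

    def mark_unreachable(self):
        top = self.ctrl[-1]
        del self.vals[top.height:]  # empty stack back to block entry
        top.unreachable = True

    def pop_expect(self, t):
        """Pop a value of expected type t (first pop flavour above)."""
        top = self.ctrl[-1]
        if len(self.vals) <= top.height:
            if top.unreachable:
                return t            # polymorphic pop succeeds in dead code
            raise TypeError("stack underflow")
        got = self.vals.pop()
        if got != t:
            raise TypeError(f"expected {t}, got {got}")
        return got

    def pop_any(self):
        """Pop a value of any type (third pop flavour, e.g. for drop)."""
        top = self.ctrl[-1]
        if len(self.vals) <= top.height:
            if top.unreachable:
                return UNKNOWN
            raise TypeError("stack underflow")
        return self.vals.pop()
```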

There are some wrinkles, and it actually turns out that you cannot always hide unreachability behind the pop operation, particularly because of some missing type annotations in the function references proposal.

  • in the function references proposal, call.ref should pop a function reference off the stack. If the stack is empty and unreachable, then this succeeds, but the stack continues to be unreachable and no new value types are pushed. Because there is no type annotation for call.ref, there is no known type to be pushed, so nothing is.
  • In contrast, func.bind does have an output type annotation and always pushes a funcref of that type, regardless of whether the control stack was reachable or not.

There are other special cases.

Overall the lack of type annotations in various places makes the current set of in-flight proposals kind of baroque.

As for interpreters, my position has, as politicians like to say, "evolved". I'd summarize it as: Wasm MVP was not designed for interpreters but it was designed for fast validation and was simple enough that modulo control flow, it can be dealt with in an interpreter fairly OK. To preserve some of the good properties that it does have, we mostly just have to avoid mistakes. We don't need to hyper optimize for interpreters, just avoid egregiously bad choices.

@conrad-watt
Contributor

conrad-watt commented Sep 17, 2020

@titzer thanks for this! The unreachable-handling approach you're describing is what gave me confidence that there would be fewer debates over the definition of syntactic deadness this time around.

It'll probably never be clean to do and despite our devotion to the formalism, implementors will continue to just wing it and sort it out via test cases in the spec suite. So suffice is to say, having a really simple mental model for it and a great test suite will be very valuable in a concrete way.

The union of all proposals in flight is actually nontrivial to do in a clean, fast way.

At the risk of this comment being referenced in a future post-mortem, I do strongly believe that the block-local unreachable change would significantly simplify the mental model. There may have been a paralyzing number of alternatives when the language was initially designed, but due to current webcompat constraints our conversation can (maybe accidentally) be much more focussed.

with reference to @rossberg...

You may assume that with hindsight, but all I can say is that there was no agreement on this at the time, and different definitions were proposed by different folks. The current, more canonical semantics actually served as a kind of forcing function for the solution you see now.

I agree that the current type checking conventions are mainly an emergent property of the formal design. It would be good to get perspectives from current engine implementers about appetite for this change. I'm not sure who to tag - maybe @lukewagner and @gahaas?

@titzer

titzer commented Sep 17, 2020

@conrad-watt If I read your proposal right, that would just mean that push() is a no-op if the top of the control stack is marked as unreachable, correct? If so, then it would be a slight simplification of some of the cases.

IIRC one argument against this was that obvious type errors in unreachable code would be suppressed, e.g. (unreachable) (f32.const 0) (f32.const 0) (i32.add) would pass validation.

@conrad-watt
Contributor

conrad-watt commented Sep 17, 2020

@conrad-watt If I read your proposal right, that would just mean that push() is a no-op if the top of the control stack is marked as unreachable, correct? If so, then it would be a slight simplification of some of the cases.

Not just push, but also pop. So there would no longer need to be any special polymorphic handling for dead code.

A revised algorithm could always work with a concrete type stack. When "an instruction that ends control flow" (ref. your post) is encountered, validation empties/discards the current control block's type stack, skips to the end of the block, pushes the block's output type annotation, then continues.

EDIT: it is still necessary to decode and check certain syntactic constraints on the skipped instructions, see (#1379 (comment))
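A rough sketch of this revised algorithm (illustrative Python, with the same made-up (name, inputs, outputs) instruction triples; in a real engine the "skim" pass would still decode immediates, per the caveat above):

```python
STACK_ENDING = {"unreachable", "br", "return"}

def validate_block(instrs, i, results):
    # Validate one block starting at index i against its result annotation.
    stack = []
    while instrs[i][0] != "end":
        name, ins, outs = instrs[i]
        if name in STACK_ENDING:
            stack = None                    # the ⊥ stack: nothing to track
            i = skip_to_end(instrs, i + 1)  # decode-only pass, no typing
            break
        for t in reversed(ins):
            if not stack or stack.pop() != t:
                raise TypeError(f"{name}: expected {t}")
        stack.extend(outs)
        i += 1
    if stack is not None and stack != results:
        raise TypeError("block result mismatch")
    return i, results             # resume after `end` with the annotation

def skip_to_end(instrs, i, depth=0):
    # A bare bytecode iterator: finds the matching `end` of the block
    # without doing any type checking along the way.
    while True:
        name = instrs[i][0]
        if name in ("block", "loop", "if"):
            depth += 1
        elif name == "end":
            if depth == 0:
                return i
            depth -= 1
        i += 1
```

Note how the type stack stays fully concrete; no polymorphic stack values are ever materialised.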

IIRC one argument against this that was that obvious type errors in unreachable code would be suppressed, e.g. (unreachable) (f32.const 0) (f32.const 0) (i32.add) would pass validation.

Yes, this would happen (although such code would still have to be parsed).

@RossTate
Author

@titzer And to clarify, it's not the "top of the control stack" that is typed as unreachable; it is the entire control stack that is typed as unreachable.

@conrad-watt
Contributor

conrad-watt commented Sep 17, 2020

@RossTate: I think @titzer's control stack is referring to nested control blocks - with one type/value stack associated with each entry on the "control stack". So the top of the control stack is marked unreachable (the innermost block being typed) - which is equivalent to declaring the bottom of the current type stack to be polymorphic/unconstrained.

In the revised algorithm, this unreachable bit on the control stack wouldn't be needed at all (except potentially to implement the "skip-to-end" handling).

@titzer

titzer commented Sep 17, 2020

When "an instruction that ends control flow" (ref. your post) is encountered, validation empties/discards the current control block's type stack, skips to the end of the block, pushes the block's output type annotation, then continues.

We discussed such an alternative too. That has the downside that facts about immediates (e.g. that indices are in bounds) are not checked. That means you need to have two modes to the validation algorithm; one that skips immediates (i.e. basically a bytecode iterator) and one that does not. This is because you can't find the end bytecode without properly decoding everything in-between.

Hiding everything in the various pop operations has the distinct advantage that the core logic for each bytecode generally doesn't have to do anything special, except for the special cases I listed above, which are all due to missing type annotations. Having unreachability spreading into the decoding logic for immediates or another mode for iterating bytecodes seems like adding complexity.

@conrad-watt
Copy link
Contributor

conrad-watt commented Sep 17, 2020

Hiding everything in the various pop operations has the distinct advantage that the core logic for each bytecode generally doesn't have to do anything special, except for the special cases I listed above, which are all due to missing type annotations.

The type annotations are only "missing" to the extent that they facilitate the current typing approach. They wouldn't be necessary in the JVM because it never has to deal with polymorphic stacks. Similarly, they wouldn't be necessary in this proposal. So another way of looking at it is that the current approach causes overhead in type annotation.

Having unreachability spreading into the decoding logic for immediates or another mode for iterating bytecodes seems like adding complexity.

There's a complexity trade-off - but I think the balance is heavily on the side of this proposal. You can't encapsulate the current typing logic purely in pop operations because of select, which needs to be special-cased (and the obvious solutions cause the representation of the type stack to become more complicated).

It's possible to factor the "skipping bytecode" logic so that it's clear what the "normal case" for each bytecode is.

@titzer
Copy link

titzer commented Sep 18, 2020

Actually, I think validating immediates but not types would be the worst of both worlds because skipToEnd would duplicate that logic. Most immediate forms are just i32 or u32 indices, which are easy to skip as a single case.

@conrad-watt
Copy link
Contributor

conrad-watt commented Sep 18, 2020

Actually, I think validating immediates but not types would be the worst of both worlds because skipToEnd would duplicate that logic. Most immediate forms are just i32 or u32 indices, which are easy to skip as a single case.

Not the worst of both worlds! Far better than what we have currently, whatever we decide for this corner-case.

When I was looking through implementations, normally immediates (e.g. u32) were read and checked simultaneously by some readU32 function which specialised an underlying readLEB. It's not clear to me that it would simplify skipToEnd much to replace occurrences of readU32 with readLEB. In fact, it would probably make it harder for skipToEnd to share code with the regular decode+validate function (e.g. through something like V8's current templating approach).

EDIT:
Skipping immediate range checks would also require more spec refactoring. I'm not denying that implementing skipToEnd requires some engineering, but it's confined to a pretty specific and well-behaved area of the implementation, and the effort has to be weighed against the costs we're incurring with the current approach (a validation algorithm severely complicated by polymorphic cases that only occur in dead code, and having to thread the "dead code" needle with the design and implementation of each new feature).

@eqrion
Copy link

eqrion commented Sep 18, 2020

I've also had negative experiences with the polymorphic stack typing rules, and tend to think they're more trouble than they're worth. For SM, as was mentioned, we'd likely have to refactor our decoding and validation code significantly, as they're fused together in a less-than-clean manner.

This is unfortunate, but if it could cut off these typing issues for good then I'd lean in favor of it. I recently did more work in this area with the f-r proposal and this logic is no longer easy to abstract away with stack operations like @titzer mentioned.

My ideal solution would be to treat all unconditional branches as equivalent to an 'end' instruction that terminates a block, effectively eliminating the definition of unreachable code we're working with here. Obviously that's a breaking change, but it'd make decoding/validation much easier.

@taralx
Copy link

taralx commented Sep 19, 2020

Does anyone have concrete data on how often dead code is actually found in modules? Is there an existing producer that can output dead code?

@kripken
Copy link
Member

kripken commented Sep 19, 2020

@taralx I don't have comprehensive data, but I see unreachable code in wasm files on the web in practice, like a double unreachable, stuff like that. Many of those are on demos and examples, though, and some are simply not optimized builds.

I don't think any producer has a guarantee of not emitting dead code. It would be a bug in LLVM or Binaryen if they did, at least in optimized builds, but I'm not aware of work to fuzz them to make sure they have no dead bytes at all (we do measure code size a lot, though, so any such amount would be tiny - but it might exist).

@taralx
Copy link

taralx commented Sep 19, 2020

Ok, but it does happen, so simply banning dead code would risk enough breakage that it isn't worth it. Thanks.

conrad-watt added a commit to WebAssembly/meetings that referenced this issue Sep 23, 2020
This can be bumped to the next meeting if necessary due to timing, but it's highly related to Andreas' `select` typing issue, so I'm hoping to piggy-back on his explanation of the issue (WebAssembly/design#1379). This proposal is for the smaller version of the change described in the linked issue.

My vision would be to have a relatively uncontroversial phase 1 vote this week to create the proposal repo, and then a much longer presentation/discussion next meeting with a possible phase 2 vote (the spec changes are small - we mainly need commitment from implementers on the decode/validation changes to advance further).

Note that it's fine for Andreas' proposed typing change to land first - we're not proposing to make reference types depend on this proposal.
binji pushed a commit to WebAssembly/meetings that referenced this issue Sep 23, 2020
@conrad-watt
Copy link
Contributor

conrad-watt commented Sep 23, 2020

The "short-term" version of this proposal will (time permitting!) be discussed for phase 1 in the next CG meeting. We're not looking for firm commitment from implementers yet; we just want to create the proposal repository in order to present a consolidated rationale + concrete spec changes.

@rossberg
Copy link
Member

@fitzgen:

I was arguing that the fuzz bug count implied additional complexity,

Perhaps, but my point was that that is likely to be true for any form of checks, so it is problematic as an argument in itself.

and that the complexity wasn't carrying its own weight in this case.

Right, and that is the discussion we should focus on.

Can you explain how type checking unreachable code provides extra safety? (I re-skimmed this thread and didn't see any relevant links to historical discussions on this particular topic, apologies if I missed something!)

I believe I mentioned it a couple of times: as I remember it from the early meetings, the safety/security concern (not mine) was that malformed dead code could be part of an attack vector (e.g., if an engine mishandles jumps somehow and allows executing such code although it never should; if such code is at least validated, the amount of harm that can be done that way is more confined). So, reducing the risk of single points of failure. The discussions happened in the early days before we got into the habit of note taking or written paper trails, so nothing I could point to. But some folks here might remember the specific safety/security angle better than I.

@RossTate
Copy link
Author

My understanding is that the primary (sole?) concern has to do with the uncertainty about what instructions should be (in)valid in the unreachable state. I have been confused by this concern, and some examples would have helped, but I think I finally figured out how to explain my confusion and at the same time address the issue.

Here is the code for push in the reference interpreter (lines 101-104, unchanged by WebAssembly/reference-types#116):

let push (ell1, ts1) (ell2, ts2) =
  assert (ell1 = NoEllipses || ts2 = []);
  (if ell1 = Ellipses || ell2 = Ellipses then Ellipses else NoEllipses),
  ts2 @ ts1

Change those last two lines to

  if ell1 = Ellipses || ell2 = Ellipses then (Ellipses, []) else (NoEllipses, ts2 @ ts1)

That change implements this proposal (and is compatible with the changes in WebAssembly/reference-types#116, though they now will have no observable effect).

What this change does is make it so that, once we reach the unreachable state (represented as (Ellipses, []) in the reference interpreter), then every subsequent instruction will be type-checked in that state (until the end of the enclosing block is reached). In other words, we already have consensus on whether a single instruction following the unreachable instruction is valid, and this proposal simply says to validate all instructions until the end of the enclosing block as if they immediately followed the unreachable instruction.

Note that there is also another way to implement this change. Rather than represent the stack type as ellipses * value_type option list (as in line 73), one could represent the stack type as (value_type list) option. One could then either continue to use the push/pop abstraction (using lists of value_type option), or one could take a bimodal approach and branch on the state of the stack type. This illustrates that this proposal offers more flexibility on how to implement type-checking, whereas WebAssembly/reference-types#116 bakes a particular implementation strategy into the spec.

As for the spec, I think the following changes are all that are necessary to achieve the above:

  1. Introduce a metavariable to represent stack types and change stack-polymorphic rules to use this metavariable. (This just makes the presentation more extensible and itself has no effect on validation.)
  2. Add an unreachable state type to the grammar for that metavariable. (On its own, this also has no effect on validation.)
  3. Add a rule that says an instruction can have the unreachable input state type and an arbitrary output state type if there exist input and output state types that would make the instruction type-check. (This single rule is the only real change to validation.)

Hopefully this illustrates how straightforward this change is to the spec; it only appears to have potential for complications because the spec did not use a metavariable to represent stack types.
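As a hedged sketch (my own notation, not the spec's), the rule in step 3 could be rendered roughly like this, writing ⊥ for the unreachable state type and S for an arbitrary state type:

```latex
% \bot is the unreachable state type; S, S_1, S_2 range over state types.
\frac{\exists S_1,\, S_2.\;\; C \vdash \mathit{instr} : S_1 \to S_2}
     {C \vdash \mathit{instr} : \bot \to S}
```

That is, an instruction type-checks from the unreachable state to any state, provided some concrete typing for it exists at all.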

@rossberg
Copy link
Member

rossberg commented Sep 30, 2020

The reference interpreter is meant to be an executable version of the spec. As such, it separates decoding from validation, like the spec, but unlike real engines. Its validator also tries to stick as close to the structure of the declarative typing rules from the spec as is possible, very unlike real engines, which all use a version of the forward algorithm from the appendix. So you can't draw any meaningful conclusions about the practical impact of this change from the interpreter.

@RossTate
Copy link
Author

@rossberg The implementers can speak for themselves. (So far, from what I can tell every implementer has been supportive of the overall change.) Did the comment above address your own concerns?

@conrad-watt
Copy link
Contributor

conrad-watt commented Sep 30, 2020

I agree that the real point of interest here is the extent to which real implementations that fuse decoding, syntactic checks, and validation are able to handle this change. We've heard some voices in support, but I agree that more discussion is needed to make sure everyone is on the same page about what the changes would entail.

EDIT: FWIW I plan to lay more of this out when our proposal repo is created

@rossberg
Copy link
Member

rossberg commented Sep 30, 2020

@RossTate, I don't understand your response. My comment is pointing out that your analysis of the reference interpreter isn't implying anything about other implementations. I'm not implying anything about other implementations myself in that comment.

@RossTate
Copy link
Author

I understand. And I did not say anything about other implementations in my comment as well. But my comment was primarily aimed at addressing your concerns, so I was hoping you would comment as to whether it did so effectively. If your only concern is that others might have some difficulty implementing this proposal, then I would like to hear about real issues from them rather than hypothesize. So can you clarify if you have any other remaining concerns?

@conrad-watt
Copy link
Contributor

Just to preempt a response from @rossberg, I feel that we have a strong case that the burden on real implementations will not be too great, but I'd rather lay it out in full in the proposal repo and have the discussion continue from there, rather than expend (probably wasted) effort on a back-and-forth in the bowels of this initial issue.

@RossTate
Copy link
Author

I agree that it's a good plan to discuss implementation specifics in the full proposal repo. I also would find it useful to know, though, if there are any concerns besides implementation specifics.

@rossberg
Copy link
Member

rossberg commented Oct 1, 2020

@RossTate, the change to the spec -- and therefore to the reference interpreter -- is not the issue; it would be fairly straightforward and akin to what's in WebAssembly/reference-types#116, except that the bottom type isn't added to opdtype but to stacktype directly, and there is a subsumption rule between bottom and regular stack types (merely having a special sequencing rule is not enough, since bottom must also match all types at the end of a block, expression, or function).

To repeat my concerns:

  • Formally, this semantics has slightly less nice properties, because composability of well-typedness is no longer invertible. This is not a big deal, though.

  • Security-wise, we potentially lose one level of defence against certain attacks. This certainly was a significant concern last time round, but it is difficult to quantify.

  • Conceptually, the change makes the parsing/validation distinction observable. That takes away a relevant degree of freedom from both spec and implementations. And ensuring that impls get it right will require a lot of subtle tests, even more so than under the current semantics. (Related, some changes we did in proposals would technically become breaking changes under this semantics, e.g., reinterpreting the alignment immediate of load/store instructions as a bit field in the multi memory proposal.)

  • Practically, this change may hence be counterproductive for some implementations, i.e., make things harder, not simpler. We have at least anecdotal evidence for this through previous complaints from implementers about the test suite making the distinction, even though they were allowed to ignore it so far (and did).

The last two points are the ones where I believe you severely underestimate the real-world complexities and the amount of churn this will produce. The devil is in the details on this one. As a former engine implementer I assume some expertise in making such predictions. ;)

With that all said, the group seems to think that this is time well spent at the moment, and I am interested in seeing the proposal explored. I just hope it won't tie our hands too much in the future.

@tlively
Copy link
Member

tlively commented Oct 1, 2020

@rossberg

(Related, some changes we did in proposals would technically become breaking changes under this semantics, e.g., reinterpreting the alignment immediate of load/store instructions as a bit field in the multi memory proposal.)

This sounds concerning and I would like to understand it more. Can you elaborate on this example and explain why it would become a breaking change under this proposal? It would be good to have a dedicated discussion on this point in an issue at https://github.com/WebAssembly/relaxed-dead-code-validation.

@conrad-watt
Copy link
Contributor

conrad-watt commented Oct 1, 2020

Andreas' point is this:
the original encoding for (e.g.) i64.load is
0x29 memarg
where
memarg := a:u32 o:u32

however, because 2^a is never allowed to exceed the natural byte alignment of the largest load (i.e. a must be at most 3) as a validation-time restriction, the higher-order bits are free to use as flags.

In the multi-memory proposal, if the MSB of the first LEB byte of a is set to 1 (which would previously have been a validation error because it would make a too big), then decoding of the load consumes an extra immediate which represents the memory index.

If this proposal were naively adopted, a load with this bit set would not have been a validation error in dead code, so this trick wouldn't be/have been backwards-compatible.

It's a good point and I have it on my radar for discussion. I plan to write up something addressing this once I've tested the existing behaviours of implementations a little more stringently.
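To make the trick concrete, here is a hedged sketch (illustrative Python, not engine or spec code; all names are mine) of decoding the alignment immediate with the multi-memory flag bit, following the description above — the flag is the high payload bit of the first LEB byte, i.e. bit 6 of the decoded alignment value:

```python
def read_leb_u32(buf, pos):
    """Minimal unsigned LEB128 decoder; returns (value, new_pos)."""
    result, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):  # bit 7 is the LEB continuation bit
            return result, pos
        shift += 7

def decode_memarg(buf, pos):
    """Decode a memarg for e.g. i64.load, with the multi-memory extension:
    if bit 6 of the alignment value is set, a memory index follows."""
    a, pos = read_leb_u32(buf, pos)
    memidx = 0
    if a & 0x40:
        a &= ~0x40
        memidx, pos = read_leb_u32(buf, pos)
    offset, pos = read_leb_u32(buf, pos)
    if a > 3:  # validation-time: 2^a must not exceed natural 8-byte alignment
        raise ValueError("invalid alignment")
    return (a, offset, memidx), pos
```

Under the original encoding, an alignment with bit 6 set was simply a validation error; the multi-memory proposal reuses that dead encoding space to change how the subsequent bytes are decoded, which is why suppressing the check in dead code would make the reuse observable.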

@tlively
Copy link
Member

tlively commented Oct 2, 2020

Makes sense, thanks!

@titzer
Copy link

titzer commented Oct 2, 2020

My initial warming to this proposal was under the assumption that skipToEnd also skipped validation of indices, which is not what is being proposed. As such, it is more of a decodeToEnd with validation of immediates. In a typical engine decoder designed for speed, either of the two would probably be better implemented as a subroutine rather than mixed into the bytecode-by-bytecode cases. The decodeToEnd routine is a little messier, so I am less warm than before. It pretty much necessitates a table mapping bytecodes to immediate kinds rather than duplicating the big switch over bytecodes. If your decoder already has that mapping and is factored to deal with "immediate kinds", it's not so invasive a change.

But since we're splitting hairs here:

The approach of trying to hide everything behind pop operations starts to break down and is very clunky in the face of bytecodes like return_call_ref, which pops a typed funcref off the stack and must check that its return types match the return types of the enclosing function. I'm not sure how you handled that in the reference interpreter, but that was the point where I basically gave up and directly returned a pair from pop(), one of whose elements indicates that the returned value type is invalid and the stack is instead unreachable (or a maybe value type, if you prefer to think of it that way). Other bytecodes that are particularly weird are call_ref and func.bind (perhaps in these cases just returning a function [] -> [] is OK? that basically skips them...). Also ref.is_null, ref.as_non_null... the general trend is that any bytecode whose stack effects (not just result type) are partially or wholly determined by the types of its input operands, and not derivable solely from immediates, needs special consideration. It might seem natural with some tricks in the formalism, but every single one of those is a head-scratcher upon first inspection.

If immediates are validated, I don't understand how that makes the distinction between decoding and validation observable. Can you explain that, @rossberg ?

@conrad-watt
Copy link
Contributor

conrad-watt commented Oct 2, 2020

It pretty much necessitates a table mapping bytecodes to immediate kinds rather than duplicating the big switch over bytecodes. If your decoder already has that mapping and is factored to deal with "immediate kinds", it's not so invasive of a change.

From my inexpert look at implementations, at least the browser engines already seem that way.

If immediates are validated, I don't understand how that makes the distinction between decoding and validation observable. Can you explain that, @rossberg ?

It's not observable in the sense that a distinguished error message is required, but (the current iteration of) the proposal would make it observable whether a particular immediate check is spec'd as part of the instruction syntax or as part of validation.

e.g. with the current proposal

function [] -> []
(locals i32 i32 i32)
local.get 50
unreachable

the above would give an error

function [] -> []
(locals i32 i32 i32)
unreachable
local.get 50

the above would not be an error, while both

function [] -> []
(locals i32 i32 i32)
local.get (2^40)
unreachable

and

function [] -> []
(locals i32 i32 i32)
unreachable
local.get (2^40)

would give errors.

That being said, I'm leaning towards a revised proposal (which I was just typing up an issue in the repo on!) where the distinction isn't between "instruction syntax vs validation" checks (which I agree are driven entirely by how the spec is factored), but between "type stack-dependent" and "type stack-independent" checks, where only "type stack-independent" checks are performed in dead code. This would make dead code strictly more checked than this iteration of the proposal, but would still avoid a polymorphic stack (all of the above examples would be errors with this proposal).

This would potentially require more change to the formalism of the spec, but avoids creating gotchas like the load example Andreas gave, while still addressing the motivating desires/frustrations of implementers. I think both iterations of the proposal would be roughly the same difficulty to implement.
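As a hedged sketch of that distinction (toy Python; instruction, names, and helper structure are mine, purely for illustration): for `local.get`, the index-bounds check consults only the immediate and the declared locals, so it would run even in dead code, while the stack effect is only applied in reachable code.

```python
def check_local_get(index, num_locals, stack, reachable):
    """Toy check for local.get under the 'stack-independent checks' variant."""
    # Type-stack-independent check: always performed, even in dead code.
    if index >= num_locals:
        raise ValueError(f"local index {index} out of bounds")
    # Type-stack-dependent effect: only applied in reachable code.
    if reachable:
        stack.append(f"type-of-local-{index}")
    return stack
```

Under this split, `unreachable` followed by `local.get 50` in a function with 3 locals is still an error, matching the examples above.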

@RossTate
Copy link
Author

RossTate commented Oct 2, 2020

between "type stack-dependent and type stack-independent" checks, where only "type stack-independent" checks are performed in dead code

This sounds like what I suggested above, which is why I was confused @rossberg still had objections.

This would potentially require more change to the formalism of the spec

I think the single rule I gave above captures this (provided the change to describing stack types via a metavariable.)

@conrad-watt
Copy link
Contributor

conrad-watt commented Oct 2, 2020

I think the single rule I gave above captures this (provided the change to describing stack types via a metavariable.)

This is enough to deal with the MVP instructions, but something more is needed to deal with post-MVP instructions such as (rtt.sub <type> (<rtt>)), where the comparison between its immediate type parameter and its stack operand would only be carried out if the input stack is not unreachable.

EDIT: @titzer's "pop-abstraction-breaking" instructions are probably better examples (#1379 (comment)). We're continuing this conversation in the proposal repo (WebAssembly/relaxed-dead-code-validation#1)
