use the parser to disambiguate valid function parameters, fixing several miscompiles and ICEs #5474
Problem
The CoffeeScript compiler currently accepts a wide variety of invalid parameter lists, successfully generating js code which then errors upon evaluation:
For example, see how the second input is processed by node:
It even produces an internal compiler error upon certain inputs:
Current Approach
Right now, we address this piecemeal, by identifying post hoc which types of param names and values are valid.
For param names:
`coffeescript/src/nodes.coffee`, lines 4340 to 4341 in 817c39a
For (literal) param values:
`coffeescript/src/nodes.coffee`, lines 4377 to 4378 in 817c39a
This approach is actually better than my proposal in several ways (see below), but it's a bit of a whack-a-mole, as this approach can only identify negative matches (invalid inputs), as opposed to defining a set of valid inputs (which is the job of the parser). This leads to my proposal.
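The difference between the two validation styles can be sketched in a few lines of JavaScript. This is only an illustration: the node kind names and both helper functions are hypothetical and do not correspond to the classes or checks in `nodes.coffee`.

```javascript
// Hypothetical node kinds for a parameter's binding target.
// These names are illustrative only, not CoffeeScript's real node classes.
const KNOWN_INVALID = new Set(['NullLiteral', 'BooleanLiteral', 'NumberLiteral']);
const KNOWN_VALID = new Set(['Identifier', 'ThisProperty', 'ObjectPattern', 'ArrayPattern']);

// Negative matching: reject only what has been explicitly listed as bad.
function denylistAccepts(kind) {
  return !KNOWN_INVALID.has(kind);
}

// Positive matching: accept only what has been explicitly listed as good.
function allowlistAccepts(kind) {
  return KNOWN_VALID.has(kind);
}

// A kind that nobody anticipated slips through the denylist,
// but is rejected by the allowlist.
const unanticipated = 'StringLiteral';
console.log(denylistAccepts(unanticipated));  // true
console.log(allowlistAccepts(unanticipated)); // false
```

Every new invalid kind requires another entry in the denylist, whereas the allowlist stays correct until the set of valid forms itself changes.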
Solution
It turns out that js function parameters (and their capability for destructuring assignment) are highly constrained in comparison to destructuring on the left-hand side of a "normal" `=` assignment expression (one which does not occur within a function parameter binding context). This propagates to CoffeeScript, and we can actually identify valid parameter assignments at parse time, by adding a parallel set of grammar rules specific to function parameters.
Comparison to Alternatives
This change strictly modifies the Jison grammar declaration, and does not otherwise modify any semantics: it produces equivalent output nodes, but from a more restricted range of inputs. This means it only errors on inputs that (I claim) it should already be rejecting, but it does it earlier instead of later.
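The restriction on parameter destructuring that the grammar change relies on can be observed directly in JavaScript. The following is a minimal sketch, assuming a Node.js runtime; the `parses` helper and the sample sources are illustrative and are not taken from this PR or its test suite.

```javascript
// Report whether `source` is syntactically valid JavaScript.
// `new Function` parses the text as a function body without running it.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err; // anything else is unexpected
  }
}

// Illustrative inputs (not from the PR's test suite).
const samples = {
  // Assignment destructuring may target any assignable expression.
  assignToMember: 'let o = {}; ({a: o.x} = {a: 1});',
  // Parameter destructuring may only bind identifiers or nested patterns.
  paramToMember: 'function f({a: o.x}) {}',
  paramToNull: 'function f({a: null}) {}',
  // A computed key is fine as long as it is paired with a binding target.
  paramComputedKey: 'function f({[k]: x}) { return x; }',
  // A computed key cannot be used as shorthand.
  paramComputedShorthand: 'function f({["x"]}) {}',
};

for (const [name, source] of Object.entries(samples)) {
  console.log(name, parses(source) ? 'parses' : 'SyntaxError');
}
```

Only `assignToMember` and `paramComputedKey` parse; the other three are rejected by the JavaScript parser itself, which is the class of input the stricter grammar aims to reject before any code is generated.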
There are further comparisons we can make:
Negative Matching
Compared to negative matching against unassignable nodes (the current approach):
For some examples of this tradeoff, take the following test cases in `error_messages.coffee` modified in this PR (abbreviated below):
On `main`:
After this change:
As you can see above, there is a bigger issue with `({@param: null}) ->` than the use of `null`: it's also incorrectly using a shorthand `@param` as an object property name! This is actually valid as a computed property name with `({[@param]: x}) ->`, but the negative matching approach currently doesn't have the logic to disambiguate those two.
However, `unexpected :` is much less informative to the user, so this approach would give up some of our ability to craft useful explanations of why something isn't a valid function parameter. For example, all it says for `({a: null}) ->` is that `null` was "unexpected", as opposed to explaining that this is because `null` is a keyword.
It's not clear to me how to improve error messages with this parsing-based approach. We avoid miscompiles, but we also lose some explanatory capacity. One way we could mitigate this would be to document the restrictions on function parameter destructuring.
I still believe this should be considered backwards-compatible, as it only rejects code that would have silently miscompiled. However, it may cause a build pipeline to break, for code which would otherwise not cause errors if it was never evaluated. I'm not sure if CoffeeScript's stability contract covers compile-time errors for code that would definitely have errored if ever executed, but if so, we could consider putting this behavior behind a flag or environment variable. It will not reject any previously-valid code.
Hygienic Code Generation
One final alternative I'll mention is restructuring our code generation so that it avoids miscompiles in the first place. This might vaguely take the best of both prior approaches, as it lets us craft custom error messages explaining how to fix an issue, but allows us to avoid whack-a-mole negative matching logic.
For example, for the case of `({["x"]}) -> x`, we could ensure that our destructuring expressions can only bind to expressions of a certain type. In the parsing-based approach from this PR, we currently achieve this via the grammar rule:
We could accomplish something similar at the point of code generation, possibly by editing `Param#asReference()` from `nodes.coffee`. However, this has one downside. Refer to our ICE from before:
While it's true that we could fix this individual ICE and still fix this larger issue in codegen, the reason the ICE occurs here is that the compiler expects a specific type of input for function parameters, makes a lot of assumptions about it, and essentially has to redo the work of the parser in order to fully validate the input. It seems to make sense to use the parser to solve this problem.