Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CS2] Destructuring #4478

Merged
merged 21 commits into from
Apr 6, 2017
Merged

[CS2] Destructuring #4478

merged 21 commits into from
Apr 6, 2017

Conversation

GeoffreyBooth
Copy link
Collaborator

The intent of this PR is to output CoffeeScript’s destructuring as ES2015+ destructured syntax whenever possible. Migrating the task from coffeescript6/discuss#69, this:

numbers = [1, 2, 3]
pairs = { a: 1, b: 2, c: 3 }

[one, two, three] = numbers
{one, two, three} = pairs

foo = ->
  [@a, @b, @c] = numbers
  {@a, @b, @c} = pairs
  return

would compile to this:

var foo, numbers, one, pairs, three, two;

numbers = [1, 2, 3];

pairs = {
  a: 1,
  b: 2,
  c: 3
};

[one, two, three] = numbers;
({one, two, three} = pairs);

foo = function() {
  [this.a, this.b, this.c] = numbers;
  ({a: this.a, b: this.b, c: this.c} = pairs);
};

The array destructuring, as you can see here, is pretty straightforward; but the object destructuring presents challenges. First, if we’re not declaring the variable at the same time (as we never do in CoffeeScript, hence the var line of declarations at the top of each scope) then we need to wrap the destructuring assignment in parentheses, per MDN. And if the target variable is attached to this, we need to use the “assigning to new variable names” shortcut to destructure into our intended destination variable, unless people have a better suggestion for how to handle this.

There is an intentional breaking change in that CS2 destructuring, like CS2 function default parameters, will apply default values only when a variable is undefined, not falsy.

@GeoffreyBooth GeoffreyBooth added this to the 2.0.0 milestone Mar 29, 2017
@GeoffreyBooth
Copy link
Collaborator Author

This is really preliminary, but I thought I’d create this PR just to start the conversation. I could use a lot of help with this. 😬 @zdenko has done some great work with object destructuring; I would love if you could also help tackle the core of ES destructuring!

I didn’t want to commit any code that horribly breaks the tests, so all I’ve committed so far are various helpers that I think will be necessary in order to get Assign to produce destructured output at least for the simplest case, {a} = b.

So I can get that case to output correctly, as:

var b;

({ b } = a);

if I change nodes.coffee:1773-1782 to the following:

      acc   = idx.unwrap() instanceof PropertyName
      value = new Value value
      value.properties.push new Index idx unless acc
      message = isUnassignable obj.unwrap().value
      obj.error message if message
      value = new Op '??', value, defaultValue if defaultValue
      assignOptions =
        param: @param
        wrapVariableInBraces: acc
      return new Assign(obj, value, null, assignOptions).compileToFragments o, LEVEL_TOP

but doing that causes 6 tests to fail, and those tests verge on incomprehensible to me.

I’d like to start small and gradually expand to more and more complicated destructuring cases, but it appears that destructuring is implemented in a very compact, recursive way that discourages piecemeal approaches. Perhaps @lydell you can help untangle it? I’m hoping to avoid the near-rewrite that I did for the function parameters. At least function parameters are a straightforward concept that I’m pretty sure I understand; destructuring can get awfully complicated . . .

@GeoffreyBooth
Copy link
Collaborator Author

Okay, I’ve made a lot of progress. The tests now pass, and simple cases like {a} = b now output in ES syntax. Any feedback on the approach so far?

@GeoffreyBooth
Copy link
Collaborator Author

Update: The branch I started this PR with has been renamed destructuring-geoffrey and @connec’s branch that he submitted #4479 with has become my repo’s destructuring, and therefore the source for this PR.

@GeoffreyBooth
Copy link
Collaborator Author

So picking up from #4479, I think @connec’s approach for destructuring is more promising than the one I was pursuing, so I want to start from his branch and cherry-pick improvements from mine. Top on my list is default values. @connec did you have an implementation in mind for those? How do you feel about the way default values are implemented in my branch, and does that complement your approach?

@connec
Copy link
Collaborator

connec commented Mar 31, 2017

Copied from #4479 -

I'll take another look tonight.

The main approach I'm going for is to use as much of the existing machinery as possible. If you look at the nodes for destructured expressions they look like they should just about compile verbatim, and it's only guard logic in nodes.coffee that's preventing that. With some small tweaks we should be able to update the guard logic to be more permissive (e.g. here).

For compiling default values, I was planning to just leverage the existing node structure, e.g.

$ echo '{ a = 1 }' | bin/coffee -bns
Block
  Assign
    Value
      Obj
        Assign
          Value IdentifierLiteral: a
          Value NumberLiteral: 1
    Value IdentifierLiteral: b

We should be able to tweak Obj so that it will compile that inner assign verbatim, rather than doing special handling for Assign's whose context isn't object.

By destructured parameters I just mean e.g. ({a}) ->, which on my branch compiles to

(function(arg) {
  var a
  ({a} = arg);
});

...but could be compiled to

(function({a}) {
});

Again, I suspect there's some small logic tweak we could make to the parameter handling to make them compile more things in-line.

@zdenko
Copy link
Collaborator

zdenko commented Mar 31, 2017

I've merged @connec's branch into mine and made some progress with the rest element in object destructuring. I'll try to fnish a.s.a.p and push changes.

It seems to be working fine, and currently, this:

obj = {a:1, b:2, c:3, d:4, e:5}
{a, b:z, d, r...} = obj

compiles to:

var a, d, obj, z, r;

  obj = {
    a: 1,
    b: 2,
    c: 3,
    d: 4,
    e: 5
  };

  ({
    a,
    b: z,
    d
  } = obj), r =  Object.keys(obj).reduce((a,c) => (!['a','b','d'].includes(c) && (a[c] = obj[c]), a), {});

@GeoffreyBooth
Copy link
Collaborator Author

Nice work! I feel like destructuring is finally coming along.

One note: should your generated output be using =>? I know it's common style in ES, but a coffee lint rule that I tend to follow says not to use the bound/fat arrow unless you need it, i.e. unless you're using this inside the function.

@connec
Copy link
Collaborator

connec commented Apr 1, 2017

@GeoffreyBooth I've updated my branch with compilation of default values. The implementation was fairly smooth. I don't seem to be able to push it here, however (no permissions?).

@GeoffreyBooth
Copy link
Collaborator Author

@connec did you try to push it to GeoffreyBooth:coffeescript/destructuring? You're a collaborator there, you should be able to push. This PR is based on that branch on my repo.

@connec
Copy link
Collaborator

connec commented Apr 1, 2017

Ah yes, it was my bad - I had your remote set as HTTPS but my authentication is SSH. I've updated the branch!

src/nodes.coffee Outdated
@@ -1104,47 +1115,77 @@ exports.Obj = class Obj extends Base

children: ['properties']

isAssignable: ->
@lhs = true
Copy link
Collaborator

@connec connec Apr 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit hacky, but I wasn't sure how else to deal with {a=1} being valid only when destructuring, since it's otherwise indistinguishable from within Obj#compileNode.

The other approach I considered was adding a "destructuring"/"assignment"/"lhs" option to o - I suspect this might be better, but felt like a 'bigger' change than a new property on Obj.

class Assign
  ...
  compileNode: (o) ->
    o.assignment = true
    ...

class Obj
  ...
  compileNode: (o) ->
    ...
    prop.operatorToken.error "unexpected #{prop.operatorToken.value}" unless o.assignment

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll look into this. Otherwise though, what else remains to be done?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's pretty close... I'd like to run it against a codebase or two and maybe add a few tests.

Would you prefer it if desrtuctured parameters went in this PR, or would you be happy for them to be tackled separately?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don’t see why you think lhs is hacky; surely it’s better to isolate this object-specific property to the Obj class? Seems okay to me.

I think we should probably tackle destructured function parameters in this same PR, because basically we will need to be adding special cases to the same places you just modified in this PR to cover the things that you can normally destructure but not in a function parameter list, like this. I understand the appeal of merging this in and then using that as the new baseline, and only seeing the changes related to function parameters in a new PR, but since the function-parameters work is so related to this work I think it would be more useful to still see these changes. I don’t feel strongly though. There’s also GeoffreyBooth#4 to merge in.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait a minute . . . is @lhs hacky because it’s getting set via the isAssignable call? And you’re relying on the fact that isAssignable is only called when Obj is part of an Assign and therefore the left hand side of a destructuring assignment?

If that’s the case, then yeah, this is really hacky. isAssignable is not the name for a method that should be mutating its class. We should find a better way.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, precisely. Unfortunately it's the only reliable way I could think to do it without:

  • Traversing the tree in Assign (e.g. @variable.traverseChildren). This is a bit awkward and probably bad for performance.
  • Putting a property on o (e.g. o.assignment = true). This would be pretty seamless, but the o API can be hard to understand. This is probably the best alternative though.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I tackled this and ended up with 05c1f46. I traced how lhs was getting set in your version (i.e., where was the isAssignable call coming from) and just set the property from the parent at that point. The end result came in rather complicated, though, because of the cases of objects within arrays, and of destructuring that originates from function parameters and not Assign. I think the performance should be the same as your version, at least.

I’m sure you could find an even more elegant way. I would at least prefer this over an isAssignable method that mutates; but feel free to try to do better still 😄

lydell
lydell previously requested changes Apr 3, 2017
@@ -141,9 +141,6 @@ test "#1192: assignment starting with object literals", ->

# Destructuring Assignment

test "empty destructuring assignment", ->
{} = [] = undefined

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was this removed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@connec removed that, but I suspect because it’s invalid ES.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well if I restore the test, it outputs as:

({} = void 0);

But Node doesn’t like it, or the less-mutated version:

node -e '({} = void 0)'
[eval]:1
({} = void 0)
    ^

TypeError: Cannot match against 'undefined' or 'null'.
    at [eval]:1:5
    at ContextifyScript.Script.runInThisContext (vm.js:23:33)
    at Object.runInThisContext (vm.js:95:38)
    at Object.<anonymous> ([eval]-wrapper:6:22)
    at Module._compile (module.js:571:32)
    at evalScript (bootstrap_node.js:387:27)
    at run (bootstrap_node.js:120:11)
    at run (bootstrap_node.js:423:7)
    at startup (bootstrap_node.js:119:9)
    at bootstrap_node.js:538:3


node -e '({} = [] = undefined)'
[eval]:1
({} = [] = undefined)
           ^

TypeError: Cannot read property 'Symbol(Symbol.iterator)' of undefined
    at [eval]:1:12
    at ContextifyScript.Script.runInThisContext (vm.js:23:33)
    at Object.runInThisContext (vm.js:95:38)
    at Object.<anonymous> ([eval]-wrapper:6:22)
    at Module._compile (module.js:571:32)
    at evalScript (bootstrap_node.js:387:27)
    at run (bootstrap_node.js:120:11)
    at run (bootstrap_node.js:423:7)
    at startup (bootstrap_node.js:119:9)
    at bootstrap_node.js:538:3

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That’s just a runtime error, just like this:

$ node -e 'foo()'
[eval]:1
foo()
^

ReferenceError: foo is not defined
    at [eval]:1:1
    at ContextifyScript.Script.runInThisContext (vm.js:26:33)
    at Object.exports.runInThisContext (vm.js:79:17)
    at Object.<anonymous> ([eval]-wrapper:6:22)
    at Module._compile (module.js:571:32)
    at Immediate.<anonymous> (bootstrap_node.js:383:29)
    at runCallback (timers.js:649:20)
    at tryOnImmediate (timers.js:622:5)
    at processImmediate [as _immediateCallback] (timers.js:594:5)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or keep the CoffeeScript 1 compilation:

$ coffee -bpe '{} = [] = undefined'
void 0;

But that seems not very useful.

Copy link
Collaborator

@vendethiel vendethiel Apr 3, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CS1 uses those for non-destructuring cases, though

coffee -bce> [v, []] = a
var v;

v = a[0], a[1];

Although it's... not very consistent...

coffee -bce> [[]] = a
a[0];

But sometimes useful

coffee -bce> ->
  return [] =
    a: b
(function() {
  return {
    a: b
  };
});

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hrm, I did remove the test because destructuring undefined or null is a runtime error. I think we should be able to restore it with an empty array/object and it should still work... I think.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the test currently fails, because it compiles to:

function () {
  return ({} = void 0);
}

which Node doesn’t like:

TypeError: Cannot match against 'undefined' or 'null'.

Adding = [] in there generates the other error I saw before:

TypeError: Cannot read property 'Symbol(Symbol.iterator)' of undefined

So I guess the question is, so what? There’s no way to make this test “pass,” since the reasonable JavaScript that a user would expect to get generated ({} = [] = undefined) also throws a runtime error. Having the compiler generate anything else feels like cheating.

So either we try to have the compiler catch this error, and the test becomes whether we do catch it; or we rewrite the test to test something else that wouldn’t be a runtime error in Node. I would lean toward the latter.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lydell for now I changed the test to your suggestion:

[] = []
{} = {}

and it passes. This is probably a worthwhile test on its own. Are there any other tests you’d like to add?

eq 1, b
eq 2, c
) {a: [1], c: 2}

context = {}
(([{a: [b, c = 2], @d, e = 4}]...) ->
(({a: [b, c = 2], @d, e = 4}) ->
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also here

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests pass with or without this change. I just thought the revised version was a more realistic test. We could test both versions if you want, but the []... syntax is still tested in the #4005: `([a = {}]..., b) ->` weirdness test, so I’m not sure we need to test it here too.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So @lydell, do you want me to revert these tests? They pass either way.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They've been like this since 2010. Let's keep them that way.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay.

@@ -150,13 +150,13 @@ test "@-parameters and splats with constructors", ->
eq b, obj.last

test "destructuring in function definition", ->
(([{a: [b], c}]...) ->
(({a: [b], c}) ->
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is acceptable. There are lots of things in programming languages that people never write, but still need to be supported for consistency.

@GeoffreyBooth
Copy link
Collaborator Author

@lydell are you okay with these revisions?

I’d like to get this merged in and tackle GeoffreyBooth#4 on its own.

@lydell
Copy link
Collaborator

lydell commented Apr 6, 2017

Could you post two examples where we fall back to the old code generation, and what they compile to? One for objects and one for arrays. Just want to think how they differ, and if they are a potential gotcha. Otherwise LGTM.

@GeoffreyBooth
Copy link
Collaborator Author

Sure. As far as I can tell the old code is only being used for splats and expansions, and for function parameters after splats or expansions:

// [x,y...,z] = [a,b,c,d,e]
var i, ref, x, y, z,
  slice = [].slice;

ref = [a, b, c, d, e], x = ref[0], y = 3 <= ref.length ? slice.call(ref, 1, i = ref.length - 1) : (i = 1, []), z = ref[i++];


// f = (a, b..., {c}) ->
var f,
  slice = [].slice;

f = function(a, ...b) {
  var c, i, ref;
  ref = b, b = 2 <= ref.length ? slice.call(ref, 0, i = ref.length - 1) : (i = 0, []), ({c} = ref[i++]);
};

Until GeoffreyBooth#4 lands there are no splats or expansions within objects, so I guess we’re not using the old code at all for any object destructuring.

@lydell
Copy link
Collaborator

lydell commented Apr 6, 2017

I guess the gotcha is destructuring generators, then?

$ node
> g = function *(){yield 1; yield 2; yield 2}
[Function: g]
> [a, ...r]=g()
{}
> a
1
> r
[ 2, 2 ]

I'm thinking that [a, r...] = g() would pull values out of the generator, while [a, r..., b] = g() wouldn't? If so, I'm not sure what to do about it other than docs?

@GeoffreyBooth
Copy link
Collaborator Author

That isn’t really a fault of this PR, as there’s no destructuring there; but it’s something we should consider. Basically “non-final” function parameters fall back to the legacy compilation, which involves caching the result of a function call, e.g.:

// [a, r…, b] = g()
var a, b, i, r, ref,
  slice = [].slice;

ref = g(), a = ref[0], r = 3 <= ref.length ? slice.call(ref, 1, i = ref.length - 1) : (i = 1, []), b = ref[i++];

Because of the ref = g(), g only gets called once here. In your example, I guess …r calls g until g is exhausted? What if g could generate infinitely?

I guess the “solution” would be to once again restrict ourselves to ES semantics, and only allow splats in the final parameter. But I don’t want to do that for such a fringe edge case as this . . . I guess a note in the docs, yeah.

Anyway, leaving function parameters aside, how’s the destructuring? 😄

@lydell
Copy link
Collaborator

lydell commented Apr 6, 2017

In your example, I guess ...r calls g until g is exhausted? What if g could generate infinitely?

g is called exactly once. g() returns a generator. ...r does take the next value of until the generator is exhausted. If the generator is infinite, you simply end up in an infinite loop. Note that other than using array destructuring, you can also use the .next() method on generators to pull values out of them.

I'm not really sure I understood the rest of your comment, but I think we're good to go on this PR.

@GeoffreyBooth
Copy link
Collaborator Author

I’m not really sure I understood the rest of your comment

I just mean that splats as the final parameter compile nicely into ES:

// (a, b, c...) ->
(function(a, b, ...c) {});

But splats in any non-final position aren’t supported in ES, and so we need to do some shenanigans:

// (a, b..., c) ->
var slice = [].slice;

(function(a, ...b) {
  var c, i, ref;
  ref = b, b = 2 <= ref.length ? slice.call(ref, 0, i = ref.length - 1) : (i = 0, []), c = ref[i++];
});

And these shenanigans aren’t equivalent to ES in the case of handling generators, as you’ve pointed out. I’m not sure there’s a way to make them equivalent without a runtime check that the assignment target is a generator function as opposed to a regular function. This maybe deserves an issue of its own, even if we decide it’s too fringe of an edge case to address.

@GeoffreyBooth GeoffreyBooth merged commit b192e21 into jashkenas:2 Apr 6, 2017
@GeoffreyBooth GeoffreyBooth deleted the destructuring branch April 6, 2017 17:09
@jashkenas
Copy link
Owner

You guys are really cooking on this — bravo!

@connec
Copy link
Collaborator

connec commented Apr 11, 2017

And these shenanigans aren’t equivalent to ES in the case of handling generators

Will that not actually work fine? Generator unpacking only happens at the call-site anyway, so these all work identically to ES2015 equivalents:

g = ->
  yield i for i in [1..3]
  return

f = (a, b..., c) ->
  return [a, b..., c]

console.log f g()
# [g, undefined]

console.log f g()...
# [1, 2, 3]

console.log f 0, g()..., 4
# [0, 1, 2, 3, 4]

@helixbass
Copy link
Collaborator

@GeoffreyBooth I just ran into something that broke (I assume unintentionally) as a result of this pull request: eg ({a: {b = 1}}) -> compiled ok before this PR but gives error: unexpected = after. This destructuring also breaks in a destructured assignment: {a: {b = 1}} = c also gives error: unexpected =

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants