Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect output due to parens messed up after expression was parsed #41

Closed
CherryDT opened this issue Apr 13, 2018 · 17 comments
Closed
Assignees
Labels
Type: Bug The issue has indentified a bug
Milestone

Comments

@CherryDT
Copy link

CherryDT commented Apr 13, 2018

I encountered a weird error, that was initially puzzling: I got an output which should have been impossible when looking at the expression I wrote in a mustache block. Only when I looked at it in the debugger, I noticed that edge.js was transforming my expression to something it was never intended to be, messing up its logic.

Code in template:

{{line.lineName + ((user.line.id === line.id) ? ' (current)' : (' (' + (line.user.username || 'unselected') + ')'))}}

Compiled code:

out += `            ${this.context.escape(this.context.accessChild(this.context.resolve('line'), ['lineName']) + this.context.accessChild(this.context.resolve('user'), ['line','id']) === this.context.accessChild(this.context.resolve('line'), ['id']) ? ' (current)' : (' (' + this.context.accessChild(this.context.resolve('line'), ['user','username']) || 'unselected') + ')')}\n`

Values:

line.id === 1
line.lineName === 'aaa'
line.user === null
user.line === null

Expected result: aaa (unselected)
Actual result: (undefined) << that is a space and then "(undefined)"

I used edge.js 1.1.3 from @adonisjs/framework 4.0.28

@thetutlage
Copy link
Member

You must use presenters for this complex syntax.

@CherryDT
Copy link
Author

CherryDT commented Apr 14, 2018

I'm sorry but that is not an acceptable answer - because, how does one know what syntax is "too complex"? Either Edge can parse JS syntax or it cannot. If it can, it should work in 100% of the cases (because it would be easy to get bugs this way that are not immediately detected). If it cannot, it shouldn't pretend that it can, and it should have the restrictions written somewhere and throw an error if an expression is not parsable.

My point is simply, silently changing logic in a way that is possibly hard to detect and debug is not an acceptable failure mode. It is simply buggy behavior and can ruin our day, and that's something that should be fixed (or rejected with an error).

@thetutlage
Copy link
Member

Mind sharing a PR for same?

@CherryDT
Copy link
Author

CherryDT commented Apr 14, 2018

I see what you are saying, if I'm critizing, I'm welcome to fix it ;) Well yes but I don't have that understanding you have, so it's unlikely I'll have a PR anytime soon.

What I'm saying is simply that "use a presenter" is not a solution - if you mean it's a hard problem for you as well and you can't fix it anytime soon, then I would have hoped that you would maybe add a warning to the docs or something like this, pointing out which things are not supported. For me the problem is that it now feels like "for anything other than {{ someVariable }} I'll need a presenter because otherwise I can never be sure my code does what I designed it to do", because there is no rule to say what needs a presenter and what doesn't, right? :|

So, my point is that it sounded like you were trying to say "it's as designed, not a bug", when it's clearly unexpected behavior and cannot be predicted, and that's what I was critizing. It's clear to me that you can't fix every issue right away, don't get me wrong!

@thetutlage
Copy link
Member

Yup seems like the one liner gave the wrong message.

The AST generation layer of Edge needs some work to work with any Javascript expression, so more involved expressions produces these weird behaviour.

I’ll add it to the docs and will see, when I can resume my work on Edge to address these issues. Currently got too much to handle.

If you are interested, then my plans is to remove Esprima and use Babel AST generator, since the root of these bugs is esprima not giving enough information

@CherryDT
Copy link
Author

CherryDT commented Apr 14, 2018

A quick fix seems to be wrapping about everything in parens. However this creates ugly code (and I'm not sure whether it creates new problems on the way):

4. convert nested binary expression
  expected '((((2) + (2))) === (4))' to equal '2 + 2 === 4'
  "((((2) + (2))) === (4))" => "2 + 2 === 4"

I have to say I don't understand the logic of this parser because the decision whether to wrap things in parens doesn't really check operator precedence, it seems like it checks only for very specific cases.

@CherryDT
Copy link
Author

CherryDT commented Apr 14, 2018

My two cents: In general it seems to me that it's not such a good idea to do the "convert back to code" part yourself. I would separate the part where you replace member access with access functions from the code generations.

What I mean: I would convert the expression to an AST using acorn, then mutate the AST (replacing things), then later convert it back to code using astring. This sounds less error-prone and also simpler to me - what do you think? Maybe I just don't get the full picture yet - I'm trying to understand why you even have to care about things other than identifiers, member accesses and calls.

(By the way: Your list of arithmetic operators is missing **, I think.)

@thetutlage
Copy link
Member

thetutlage commented Apr 14, 2018

What I mean: I would convert the expression to an AST using acorn

Acorn, doesn't return the proper information about the parentheses.

Your list of arithmetic operators is missing **, I think

Feel free to create a PR for same.

then later convert it back to code using astring

Not sure how flexible astring is, but Edge cannot parse it back to strict Javascript. It has to be lenient with nulls and undefined. For example: In your case, the line.user was null but still line.user.username didn't throw the exception.

@CherryDT
Copy link
Author

CherryDT commented Apr 14, 2018

Power operator: #44

Your accessFn handles the null/undefined thing, right?
But maybe I didn't express myself well.

My general idea was along the lines of:

  1. You convert a.b + 3 to an AST
  2. You modify that AST so that it represents this.context.accessChild(this.context.resolve('a'), ['b']) + 3
  3. You convert the modified AST back to code

@CherryDT
Copy link
Author

CherryDT commented Apr 14, 2018

Proof of concept here: https://repl.it/@CherryDT/EdgeAstModPoC
It takes code and transforms it to use this.context.resolve, this.context.accessChild and this.context.callFn without altering the logic.

transformCode("line.lineName + ((user.line.id === line.id) ? ' (current)' : (' (' + (line.user.username || 'unselected') + ')'))")

gives:

this.context.accessChild(this.context.resolve("line"), ["lineName"]) + (this.context.accessChild(this.context.resolve("user"), ["line", "id"]) === this.context.accessChild(this.context.resolve("line"), ["id"]) ? ' (current)' : ' (' + (this.context.accessChild(this.context.resolve("line"), ["user", "username"]) || 'unselected') + ')');

...which is exactly what it should be.

And all the heavy lifting is moved to acorn/astring, your code will become a lot simpler that way, I guess.

(I'm not sure what you meant regarding acorn not returning parens... It works fine for me. The parens are not needed as separate piece of information, the tree already encapsulates that info. In fact, that means that unnecessary parens (and only those) will be gone at the end, which is actually a good thing.)

However, I can't put this into a PR because I don't understand enough about the big picture of your parser, because it seems like it is also used for contents of tags and things like that. But I hope this proof-of-concept brought the idea across.

EDIT: Added support for callFn in the PoC.

@thetutlage
Copy link
Member

The parens are not needed as separate piece of information

They may not be needed in some cases, but completely required when doing mathematical calculations. For example

2 * (3 + 2 * 4)

// and this
2 * 3 + 2 * 4

Above both statements will return different output, based upon the parens in place

Also I am using Esprima and not Acorn, so things maybe different.

I liked the idea of using astring and happy to use it.

If you think, that you can replace Esprima with Acorn and astring then it will be nice, otherwise I will work on it once done with all other tasks.

@thetutlage
Copy link
Member

Also when using https://astexplorer.net, you can see that babylon is more accurate than acorn.

Acorn

acorn

Babylon

babylon

@thetutlage
Copy link
Member

Also I just tried acorn on my machine and their is an option to preserve parentheses called preserveParens, so acorn seems to be a better fit in that case, since babylon is tailored for JSX and flow and we don't need that in Edge.

In short, if we can start by replacing Esprima with Acorn and retain parentheses then it will be a good starting point.

Later I want to optimize Edge AST parser too, which detects mustache and @tags, but that is a 2nd step

@CherryDT
Copy link
Author

CherryDT commented Apr 15, 2018

I think there is still some kind of misunderstanding here regarding the parens. In short: You don't need to preserve parens as such. Because the position of things in the tree has all the information needed. Everything else (such as where unnecessary parens were) are cosmetic information which is entirely superfluous in your use case. Sure, you won't know whether an expression was in parens or not when you look only at the node in the AST alone, but you don't have to - the code generator will know because it sees the whole tree and not just the node alone!

In a way, it's like saying "but we need to preserve the information whether single quotes or double quotes were used, because the escaping will be different, e.g. "hasn't" works with double quotes but needs escaping 'hasn\'t' in single quotes" - that's "thinking too small" because the parsed form of the code has the value of the string, and that is the same in both cases, namely hasn't. Whether to put single quotes around it (and thereby escape it) or double quotes is a decision of the code generator at the end which makes no difference functionality-wise - and it's the same here.

david@CHE-X1:/mnt/c/Users/david/proj/edge-ast-mod-poc $ node
> const { parse } = require('acorn')
undefined
> const { generate } = require('astring')
undefined
> generate(parse('((customerCount / usersCount) * 100)'))
'customerCount / usersCount * 100;\n'
> generate(parse('2 * (3 + 2 * 4)'))
'2 * (3 + 2 * 4);\n'
>

You are right that the parens are by default gone in ((customerCount / usersCount) * 100), but that's actually good because they were never needed in the first place! That's because / and * are same priority and evaluated from left to right, therefore, (x / y) * z == x / y * z much like (x * y) * z == x * y * z. And the outer parens are also gone because they are always meaningless. (If you now say: Hah, wait, they are not meaningless once it's part of another expression, then - yes, true, but then they will appear the moment you add them into the other expression - "being wrapped" is not a property of the inner expression, it's part of the construction of code out of a tree when the default operator precedence would not yield the desired result.)

For the other example, where the parens actually make a difference, you can see that they are kept.

That's because the tree is all you need:

- Binary *
  LHS:
    - Binary /
      LHS:
        - Identifier customerCount
      RHS:
        - Identifier usersCount
  RHS:
    - Literal 100

This is a perfectly good representation of both ((customerCount / usersCount) * 100) and customerCount / usersCount * 100, logic-wise, since they behave the same. And without the preserve-parens option, you will get the latter, which, again, I think is a good thing, because it makes the generated code leaner (doesn't carry user-induced clutter over to the generated code).

As for the other example:

- Binary *
  LHS:
    - Literal 2
  RHS:
    - Binary +
    LHS:
      - Literal 3
    RHS:
      - Binary *
        LHS:
          - Literal 2
        RHS:
          - Literal 4

And this represents 2 * (3 + 2 * 4) without ever mentioning the word "parenthesis".

Note that 2 * 3 + 2 * 4 would be simply a different tree:

- Binary +
  LHS:
    - Binary *
      LHS:
        - Literal 2
      RHS:
        - Literal 3
  RHS:
    - Binary *
      LHS:
        - Literal 2
      RHS:
        - Literal 4

As you can see, the operations happen in a different order, i.e. the literals are in different places in the tree. This is enough information to accurately represent an expression with all its logic (and only that).

Btw I tried a proof of concept of this inside Edge, by monkey-patching parseRaw to override the toStatement method of the returned statement/expression to instead return such a transformed string, and this works great for everything inside mustaches (I uncommented the async test filter, so that the functional tests ran, and added my tests cases there), but everything else failed, partly because of subtle differences in the code (e.g. quote style) which made unit tests fail, but also, more problematic, because other parts of the code which handle tags seems to not like what I'm doing (thing like addValue got empty parameters and other ugly things).

@thetutlage
Copy link
Member

If I take the below statement to AST explorer, I don't see where the information regarding parens is persisted.

2 * (3 + 2 * 4)

https://astexplorer.net/#/gist/bf2aa1d3e427d5741bab91d33e556b4e/67a1d25d8170baaad9ba37515c51e860bc0bdb25

@thetutlage
Copy link
Member

Okay, so the different tree thing does makes sense.

@thetutlage
Copy link
Member

I am happy accepting a PR that does replace Esprima and custom AST -> Expression with acorn and astring.

@thetutlage thetutlage added this to the 2.0 milestone May 23, 2018
@thetutlage thetutlage added the Type: Bug The issue has indentified a bug label May 23, 2018
@thetutlage thetutlage self-assigned this May 23, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Type: Bug The issue has indentified a bug
Projects
None yet
Development

No branches or pull requests

2 participants