Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Need to process win32 line break correctly #146

Closed
3cp opened this issue Oct 28, 2020 · 15 comments · Fixed by #150
Closed

Need to process win32 line break correctly #146

3cp opened this issue Oct 28, 2020 · 15 comments · Fixed by #150

Comments

@3cp
Copy link
Member

3cp commented Oct 28, 2020

I noticed we skipped lots of test files in test262-parser-tests, I tried to unblock them to see what happens.

For example the first skipped file:

The 110fa1efdd0868b8.js file contents is

"Hello\
world"

But the file is using win32 line break \r\n which causes issue in meriyah.

@3cp
Copy link
Member Author

3cp commented Oct 28, 2020

It seems our code didn't handle LineContinuation inside string literal at all, the successful pass on \ + \n was an accident.

@KFlash
Copy link
Contributor

KFlash commented Oct 28, 2020

strange. Can you do something like this

It's just skipped here, but according to the specs only CR and LF are invalid char in string literals now. For PS and LS I'm not 100% sure if the column value should be increased?

Similar to what you see here

@3cp
Copy link
Member Author

3cp commented Oct 28, 2020

I did not see you handle LineContinuation in that escaya code either.

@KFlash
Copy link
Contributor

KFlash commented Oct 28, 2020

A LineContinuation is \r, so what's done is to pass all chars starting with \ to escaped chars. Then increase the index. If no "switch match" then return String.fromCharCode(ch); and you get your empty code point

@3cp
Copy link
Member Author

3cp commented Oct 28, 2020

Thx, so it's not an accident. Just a missing case on \r\n.

BTW, it seems test262-parser-tests are being maintained poorly, many unanswered issues.

@3cp
Copy link
Member Author

3cp commented Oct 28, 2020

The LineContinuation supports \n, \r, \r\n, and unicode LS and PS.

http://www.ecma-international.org/ecma-262/11.0/index.html#prod-LineTerminatorSequence

@KFlash
Copy link
Contributor

KFlash commented Oct 28, 2020

test262-parser-tests is no good. Either try the real test suite or look into https://github.com/v8/v8/blob/master/test/cctest/test-parsing.cc for some edge cases etc

LS and PS are now allowed in string literals, but \n, \r, and \r\n are dissallowed.

If you read what I wrote in my last comment on how to grab the empty code point, it can be done the same for the sequence \r\n if you slice and append the string

\r - res += res + slice...
\n - res += res + slice...

The tricky part is for this cases \\r and \\n

@KFlash
Copy link
Contributor

KFlash commented Oct 28, 2020

@KFlash
Copy link
Contributor

KFlash commented Oct 29, 2020

I'm trying to give some "super powers" to the gist I linked too so this can be solved once for all. Give me 30 min.

@KFlash
Copy link
Contributor

KFlash commented Oct 29, 2020

Done! https://gist.github.com/KFlash/a8452b70d3596e0f79a2b8d9b3288047

Taken from the top of my head, so didn't run any tests against this, but it should work

@KFlash
Copy link
Contributor

KFlash commented Oct 29, 2020

@3cp In my gist the cases should be ordered 0, 1, 2 , and you can index++ - skip the char - before the loop if you update the backslash consumer. I had no time to look into that.

@KFlash
Copy link
Contributor

KFlash commented Oct 29, 2020

Updated the gist and added escape char scanning. I tweaked it, but you still need to "adjust". I think this should solve this issue

@3cp
Copy link
Member Author

3cp commented Oct 29, 2020

Any reason you write these?

export function scanString(parser: ParserState, cp: number): Token {
  parser.pos++;
  let quote = cp;

Instead of keeping existing param name quote?

export function scanString(parser: ParserState, quote: number): Token {
  parser.pos++;
  let cp = quote;

@KFlash
Copy link
Contributor

KFlash commented Oct 29, 2020

No reason. I just tried to avoid assigning of the variable inside the loop. Costs memory. It was all done fast and without too much thinking.

@3cp
Copy link
Member Author

3cp commented Oct 30, 2020

It turns out a very small bug in existing code, it was sitting there because of missing test coverage.
Let's just patch it for now, worry about the bigger refactoring later.

@3cp 3cp closed this as completed in #150 Oct 30, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants