- lexer.rl: handle CRLF as a line separator #1022

Open
wants to merge 1 commit into
base: master


Collaborator

@iliabylich iliabylich commented Jun 8, 2024

Closes #1020.

3.3.1 :004 > Parser::CurrentRuby.parse("1\r\n2\r\n3").children[2].loc
 => #<Parser::Source::Map::Operator:0x00000001222f0f80 @expression=#<Parser::Source::Range (string) 6...7>, @node=s(:int, 3), @operator=nil>

3.3.1 :005 > Parser::CurrentRuby.parse("1\r\n2\r\n3").children[2].loc.expression.source
 => "3"

A few notes:

  1. When \r\n is the line separator, the parser still emits a tNL token whose location covers only the \n character.
  2. tSTRING_CONTENT tokens now have proper locations, but their content doesn't include the \r part of \r\n (because eval(%{"\r\n"}) is just "\n"), so the .source of their locations doesn't match the string content. I guess that's fine; the same happens with all escape sequences anyway.
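
To illustrate the second note, here is a minimal sketch in plain Ruby (no parser gem involved; the variable names are made up for the example). It assumes CRuby's usual behaviour of reading a raw CR LF pair inside a string literal as a single newline, and compares that with an ordinary escape sequence, where the source text and the resulting value also differ:

```ruby
# Source text of a string literal with a raw CR LF pair inside the quotes.
crlf_source   = %{"\r\n"}
# Source text of a string literal with a backslash-t escape (4 characters).
escape_source = '"\t"'

# Evaluate both snippets to get the values the literals produce.
crlf_value   = eval(crlf_source)
escape_value = eval(escape_source)

p crlf_source.bytesize    # quote, CR, LF, quote
p crlf_value              # the CR is gone from the value
p escape_source.bytesize  # quote, backslash, t, quote
p escape_value            # a single tab character
```

In both cases the source range is longer than the value it produces, which is why a mismatch between .source and the token content is not unique to \r\n.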

If it doesn't break RuboCop's test suite, I guess it's safe to merge as is.

@kddnewton Could you take a look at this please? Does it fix Prism's translator?

@kddnewton

This gets close, but runs into issues when an escaped \r is followed by a literal \n and the two get grouped, as in:

<<EOS
foo\rbar
baz\r
EOS

(Each line ends with a regular newline character.) With this PR, the escaped \r at the end of the second line and the literal \n that follows it get grouped into a single line separator, because the gsub runs after escape sequences are resolved.
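
The ordering problem can be sketched with a toy model in plain Ruby. This is not the lexer's actual code: resolve_escapes and normalize_crlf are hypothetical helpers that handle only the \r and \n escapes, and raw_body stands in for the heredoc body as it appears in the source file.

```ruby
# Toy model, not the parser gem's lexer: only \r and \n escapes are handled.
ESCAPES = { "\\r" => "\r", "\\n" => "\n" }.freeze

# Turn backslash-r / backslash-n (two characters each) into CR / LF.
def resolve_escapes(raw)
  raw.gsub(/\\[rn]/) { |seq| ESCAPES.fetch(seq) }
end

# Collapse a CR LF pair into a single LF.
def normalize_crlf(text)
  text.gsub("\r\n", "\n")
end

# Heredoc body as written in the source: backslash-r is two characters,
# and each line ends with a literal LF.
raw_body = "foo\\rbar\nbaz\\r\n"

# Escapes resolved first, then CRLF collapsed: the escaped \r on the last
# line is swallowed together with the literal LF that follows it.
escapes_first = normalize_crlf(resolve_escapes(raw_body))

# CRLF collapsed on the raw source first: there is no raw CR to collapse,
# so the escaped \r survives.
crlf_first = resolve_escapes(normalize_crlf(raw_body))

p escapes_first
p crlf_first
```

Under these assumptions, only the second ordering keeps the trailing carriage return that the escaped \r is supposed to produce.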

Successfully merging this pull request may close these issues.

Offsets with \r\n in source