Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Frozen strings and character literal syntax #342

Open
viralpraxis opened this issue Dec 11, 2024 · 6 comments
Open

Frozen strings and character literal syntax #342

viralpraxis opened this issue Dec 11, 2024 · 6 comments

Comments

@viralpraxis
Copy link

I'm trying to figure out why

RuboCop::AST::ProcessedSource.new('"\n"', 3.3).ast.value.frozen?
=> false

but

RuboCop::AST::ProcessedSource.new('?\n', 3.3).ast.value.frozen?
=> true

Is this intentional? Both these expressions have the same AST (str "\n")

@marcandre
Copy link
Contributor

Hmmm, I don't think that would be intentional and think it would be best frozen.

Is that using parser or prism?

@viralpraxis
Copy link
Author

RuboCop::AST::ProcessedSource.new('?\n', 3.3, parser_engine: "parser_prism").ast.value.frozen?
=> false
RuboCop::AST::ProcessedSource.new('?\n', 3.3, parser_engine: "parser_whitequark").ast.value.frozen?
=> true

@marcandre
Copy link
Contributor

AFAIK, rubocop-ast does not process the arguments of the nodes, so I'd open an issue with the parser (Prism only?) which should, imo, freeze these at parsing.

@kddnewton
Copy link

kddnewton commented Dec 17, 2024

I'm not sure I understand what is intentional and what is not here.

irb(main):012> RuboCop::AST::ProcessedSource.new('?\n', 3.3, parser_engine: "parser_whitequark").ast.value.frozen?
=> true
irb(main):013> RuboCop::AST::ProcessedSource.new('"\n"', 3.3, parser_engine: "parser_whitequark").ast.value.frozen?
=> false

Is it just character literals? It looks like most string literals are unfrozen coming back from the parser gem:

irb(main):016> Parser::CurrentRuby.parse('?\n').children[0].frozen?
=> true
irb(main):017> Parser::CurrentRuby.parse('"\n"').children[0].frozen?
=> false

@kddnewton
Copy link

Funny enough, it looks like it's not even consistent within character literals:

irb(main):019> Parser::CurrentRuby.parse('?a').children[0].frozen?
=> false

@kddnewton
Copy link

Okay, further investigation reveals that it depends on if the parser gem determines something is string content or if something is a token. Tokens are frozen, string content isn't. It seems to be relatively arbitrary. I'm okay freezing strings in the prism AST in general, but matching this behavior feels a bit odd.

viralpraxis added a commit to viralpraxis/rubocop-performance that referenced this issue Dec 18, 2024
Since sometimes AST string nodes might have their values frozen,
we should call `dup` explicitly to make sure `to_string_literal`
(which does not work with frozen strings) does not raise. Maybe
we should also apply a patch on RuboCop's core side to fix the
`to_string_literal` itself (it now uses `force_encoding` on its argument).

See [1] and [2] for details.

[1] ruby/prism#3309
[2] rubocop/rubocop-ast#342
viralpraxis added a commit to viralpraxis/rubocop that referenced this issue Dec 19, 2024
Currently, there are exactly three instances where the receiver-mutating
`String#force_encoding`` method is used. In all cases except for
`to_string_literal`, a `String#dup` call is included to ensure compatibility
with frozen strings. I couldn't identify any counterexamples in rubocop core that would result
in a runtime error, but there is at least one instance in `rubocop-performance` [1]
where a frozen string might be passed. To ensure everything works correctly,
we need to add a `#dup` call in the implementation of `to_string_literal`.

For additional context, it's worth noting that the Prism parser might start
freezing AST leaf nodes in the future [2],
and whitequark's parser has some unpredictability in this regard [3].

[1] rubocop/rubocop-performance#480
[2] ruby/prism#3309
[3] rubocop/rubocop-ast#342
viralpraxis added a commit to viralpraxis/rubocop that referenced this issue Dec 19, 2024
Currently, there are exactly three instances where the receiver-mutating
`String#force_encoding` method is used. In all cases except for
`to_string_literal`, a `String#dup` call is included to ensure compatibility
with frozen strings. I couldn't identify any counterexamples in rubocop core that would result
in a runtime error, but there is at least one instance in `rubocop-performance` [1]
where a frozen string might be passed. To ensure everything works correctly,
we need to add a `#dup` call in the implementation of `to_string_literal`.

For additional context, it's worth noting that the Prism parser might start
freezing AST leaf nodes in the future [2],
and whitequark's parser has some unpredictability in this regard [3].

[1] rubocop/rubocop-performance#480
[2] ruby/prism#3309
[3] rubocop/rubocop-ast#342
viralpraxis added a commit to viralpraxis/rubocop that referenced this issue Dec 20, 2024
Currently, there are exactly three instances where the receiver-mutating
`String#force_encoding` method is used. In all cases except for
`to_string_literal`, a `String#dup` call is included to ensure compatibility
with frozen strings. I couldn't identify any counterexamples in rubocop core that would result
in a runtime error, but there is at least one instance in `rubocop-performance` [1]
where a frozen string might be passed. To ensure everything works correctly,
we need to add a `#dup` call in the implementation of `to_string_literal`.

For additional context, it's worth noting that the Prism parser might start
freezing AST leaf nodes in the future [2],
and whitequark's parser has some unpredictability in this regard [3].

[1] rubocop/rubocop-performance#480
[2] ruby/prism#3309
[3] rubocop/rubocop-ast#342
viralpraxis added a commit to viralpraxis/rubocop-performance that referenced this issue Dec 21, 2024
Since sometimes AST string nodes might have their values frozen,
we should call `dup` explicitly to make sure `to_string_literal`
(which does not work with frozen strings) does not raise. Maybe
we should also apply a patch on RuboCop's core side to fix the
`to_string_literal` itself (it now uses `force_encoding` on its argument).

See [1] and [2] for details.

[1] ruby/prism#3309
[2] rubocop/rubocop-ast#342
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants