Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Code action to convert string literal #2097

Merged
merged 10 commits into from
Dec 5, 2024

Conversation

Sekky61
Copy link
Contributor

@Sekky61 Sekky61 commented Nov 30, 2024

Code action to rewrite string literal <-> multiline string

The code action will be available if the cursor is inside the string literal and if it is convertible.

Code action name in both directions: refactor.convertStringLiteral.

Tested in: Neovim, VS Code

Converting string literal -> multiline string

Multiline string literals have no escapes, so strings with some escapes cannot be converted:

Escape Sequence Name Convertible to multiline
\n Newline
\r Carriage Return
\t Tab
\\ Backslash
\' Single Quote
\" Double Quote
\xNN hexadecimal 8-bit byte value (2 digits)
\u{NNNNNN} hexadecimal Unicode scalar value UTF-8 encoded (1 or more digits)

It is possible to convert at least some of the hex values to the utf-8 codes.

string literal -> multiline string example
const foo = "Hello\nWorld";
const foo = 
\\Hello
\\World
;

Converting multiline string -> string literal

Escapes must be added in some cases.
Newline \n must be added for every line of the multiline string except for the last one.

From ascii select characters can be found in the string, which makes things easier.
Quoting the tokenizer for multiline string:

    0x01...0x09, 0x0b...0x0c, 0x0e...0x1f, 0x7f => continue :state .invalid,
multiline string -> string literal
const foo = 
\\1. Eggs
\\2. "Bacon"
\\
;
const foo = "1. Eggs\n2. \"Bacon\"\n";

Discussion

The code action does not appear when cursor is on the markup (\\ or ").
I would like to fix this because this way it is not possible to use the action on empty strings.
It is also less convenient for the user.
The position context might be updated to not exclude the markup, but there might be some consequences.

Other notes:

Questions:

  • Did I understand \r correctly?
  • Should indentation be part of the produced edit, or does fmt take care of it afterwards? Right now there is a hardcoded 4 space indent
  • Performance-wise, there might be a lot of appendSlice calls in a loop. An upper bound of the final string's length could be determined and preallocated.

Copy link
Member

@Techatrix Techatrix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code action does not appear when cursor is on the markup (\\ or ").
I would like to fix this because this way it is not possible to use the action on empty strings.
It is also less convenient for the user.
The position context might be updated to not exclude the markup, but there might be some consequences.

Looking at all uses of Analyser.PositionContext, the only place where Analyser.PositionContext.string_literal is being used is for filesystem completions inside string literals in build.zig files. #1668
As long as that continues to work, you are free to adjust getPositionContext to return .string_literal as needed.

  • Did I understand \r correctly?

You should take a look at ziglang/zig-spec#38. TLDR a multi-line string literal can't contain carriage returns.

Should indentation be part of the produced edit, or does fmt take care of it afterwards? Right now there is a hardcoded 4 space indent

Usually the user will have to invoke formatting manually after applying the code action (usually with format on save). Ideally we would produce code that is already valid to zig fmt but it isn't easy to cover all cases.


I would suggest to reject string literals that contain \xNN or \u{NNNNNN} escape sequences. They can be used to produce invalid UTF-8:

const foo = "\xaa";

I tested this in VS Code and it showed the "convert to a multiline string literal" code action but applying it did nothing.

src/features/code_actions.zig Outdated Show resolved Hide resolved
tests/lsp_features/code_actions.zig Outdated Show resolved Hide resolved
src/features/code_actions.zig Outdated Show resolved Hide resolved
src/features/code_actions.zig Outdated Show resolved Hide resolved
src/features/code_actions.zig Outdated Show resolved Hide resolved
Copy link
Member

@Techatrix Techatrix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should also add a if (!builder.wantKind(.{ .custom_value = "refactor.convertStringLiteral" })) return; check. A tracy zone for each function would also be nice.

See #2087

@Sekky61
Copy link
Contributor Author

Sekky61 commented Dec 2, 2024

Thanks for the feedback. I deviated in one point: I added a utf8ValidateSlice which I think could catch most of the conversion problems.

Copy link
Member

@Techatrix Techatrix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I followed up to c129bd3 so that the remaining test cases work as well. This means that the string literal code action should work in the following situation:

const foo = "hello world"<cursor>;

There was also one change missing where carriage returns should be rejected when converting to multi-line string literals. I made the change myself.

I am currently running some fuzzing take make sure that no regressions have been introduced to the position context. If that doesn't report any issue, this is ready to be merged.

Copy link
Member

@Techatrix Techatrix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@Techatrix Techatrix merged commit 70f9e7c into zigtools:master Dec 5, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants