Restructure rustc_lexer::unescape #136538

hkBst · 2025-02-04T13:31:31Z

Separate the functions for unescaping each kind of string and unit:

this duplicates some code, but also gets rid of code that is only there for genericity
each function is now simpler by inlining booleans, which might lead to faster code

Use a Peekable<CharIndices<'_>> instead of going back and forth between string slice and chars iterator.

this gets rid of most position computations
allows removal of double traversal for correct backslash newline escapes in skip_ascii_whitespace

Improves documentation

rustbot · 2025-02-04T13:31:41Z

rustbot has assigned @petrochenkov.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

petrochenkov · 2025-02-05T14:08:18Z

Every year someone restructures this code to match their taste.
I'll just send this to the author of the previous restructuring.
r? @nnethercote

rustbot · 2025-02-05T14:08:20Z

Could not assign reviewer from: nnethercote.
User(s) nnethercote are either the PR author, already assigned, or on vacation. Please use r? to specify someone else to assign.

hkBst · 2025-02-05T14:19:57Z

Every year someone restructures this code to match their taste.

Haha :) that is ominous. I am hoping to see some small perf wins to show good taste, and I am seeing a chance for perhaps adding some structure to the errors.

Separate the functions for unescaping each kind of string and unit: - this duplicates some code, but also gets rid of code that is only there for genericity - each function is now simpler by inlining booleans, which might lead to faster code Use a Peekable<CharIndices<'_>> instead of going back and forth between string slice and chars iterator. - this gets rid of most position computations - allows removal of double traversal for correct backslash newline escapes in skip_ascii_whitespace Improves documentation

rustbot · 2025-02-07T16:23:20Z

rust-analyzer is developed in its own repository. If possible, consider making this change to rust-lang/rust-analyzer instead.

cc @rust-lang/rust-analyzer

nnethercote · 2025-02-09T21:24:08Z

@hkBst: is this ready for a perf run and then review?

hkBst · 2025-02-10T08:15:14Z

@nnethercote yes, sorry for not stating that explicitly.

nnethercote · 2025-02-10T08:40:08Z

@bors try @rust-timer queue

Restructure rustc_lexer::unescape Separate the functions for unescaping each kind of string and unit: - this duplicates some code, but also gets rid of code that is only there for genericity - each function is now simpler by inlining booleans, which might lead to faster code Use a Peekable<CharIndices<'_>> instead of going back and forth between string slice and chars iterator. - this gets rid of most position computations - allows removal of double traversal for correct backslash newline escapes in skip_ascii_whitespace Improves documentation

bors · 2025-02-10T08:41:23Z

⌛ Trying commit e00522b with merge 75f0e3d...

bors · 2025-02-10T10:36:47Z

☀️ Try build successful - checks-actions
Build commit: 75f0e3d (75f0e3d1b58cefe7a6f6dbbe979b8b0363251914)

rust-timer · 2025-02-10T11:52:26Z

Finished benchmarking commit (75f0e3d): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

	mean	range	count
Regressions ❌ (primary)	0.3%	[0.2%, 0.6%]	16
Regressions ❌ (secondary)	1.3%	[0.2%, 2.7%]	20
Improvements ✅ (primary)	-2.6%	[-2.6%, -2.6%]	1
Improvements ✅ (secondary)	-0.3%	[-0.3%, -0.3%]	2
All ❌✅ (primary)	0.1%	[-2.6%, 0.6%]	17

Max RSS (memory usage)

Results (primary 3.5%, secondary 2.6%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	5.2%	[2.9%, 7.7%]	3
Regressions ❌ (secondary)	2.6%	[2.1%, 3.2%]	6
Improvements ✅ (primary)	-1.5%	[-1.5%, -1.5%]	1
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	3.5%	[-1.5%, 7.7%]	4

Cycles

Results (primary -2.7%, secondary 0.1%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	2.7%	[2.6%, 2.8%]	2
Improvements ✅ (primary)	-2.7%	[-2.7%, -2.7%]	1
Improvements ✅ (secondary)	-2.5%	[-2.6%, -2.3%]	2
All ❌✅ (primary)	-2.7%	[-2.7%, -2.7%]	1

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 781.62s -> 780.755s (-0.11%)
Artifact size: 348.30 MiB -> 348.30 MiB (0.00%)

nnethercote · 2025-02-11T22:29:44Z

Thanks for the PR! I appreciate that this code is very finicky, and I understand the desire to improve it. But there are some issue with the PR as it stands.

The numbers of lines of code has increased.
As you say, there is a lot of duplication. unescape_{str,byte_str,c_str} is a good example: three moderately complex 22 line functions that each differ by a single line. This gives a high potential for inconsistencies and bugs to creep in, in the future. unescape_{char,byte}_iter is a similar case. There is also a lot of comment duplication or almost-duplication.
Performance is worse. The syn-1.0.89 opt result is a red herring; that benchmark is very noisy. The regressions on coercions are the real story here. That benchmark is a genuine stress test for string literal handling. This code has been carefully tuned for maximum perf (e.g. see the "Force-inlining here is aggressive" comment) and even minor changes can have surprising effects.

Also, you've made a bunch of changes in a single commit. Because this code is so intricate, it's really hard to read the diff. I suspect it could be broken into several atomic commits, each one doing a single distinct change. That would be much easier to review. I won't suggest doing that for the whole PR, because I don't think it's acceptable for the reasons I gave above, sorry! However, I wonder if a subset of the changes might still be worthwhile, such as the Peekable<CharIndices<'_>> changes. Perhaps you can redo a subset of the changes in a new branch, the parts that don't cause so much code duplication, perhaps in multiple smaller commits. That might lead to a less invasive PR that still has some benefit.

hkBst · 2025-02-12T09:04:45Z

@nnethercote thanks for reviewing and interpreting the perf results (I wasn't quite sure how to interpret them).

I realize the diff is quite useless as this is almost a complete code replacement, so thanks again for your efforts. I really appreciate them. I think this happened mostly because I started out duplicating some code to make it more concrete, so I could slowly simplify parts until I could understand it better, but I am happy to rework history if/once I find a useful improvement. In that respect it is funny that you shold mention being interested in the Peekable changes as I had started to suspect that part of negatively impacting performance. Maybe if I can isolate that we can try and see...

I have some more ideas for this code. In particular I think it would be much better if the functions taking a callback turned into iterators, such that it is easier to consume them. This may have further unpredictable perf results, but iterators are supposed to be fast right? Or to put it another way: I don't have a good sense how well callback-based code optimizes, but I can imagine that it could be opaque to the compiler resulting in missed opportunities.

I'd be happy to hear from you, if you have any more thoughts on this.

hkBst · 2025-02-12T09:13:55Z

Oh, one more thing I wanted to mention. I think the similarities between unescaping bytes, chars, and C strings are definitely there, but the current code also compromises to squeeze them together, such as the bytes functions returning chars instead of bytes, the MixedUnit allowing nul bytes (might be acceptable if it was used for something other than just C strings, in fact I'm currently thinking of this struct as CChar), and the uses of unreachable.

hkBst · 2025-02-12T10:43:37Z

See #136919 for a PR that introduces just Peekable, which almost necessitates using CharIndices instead of Chars, since you lose as_str to determine position.

hkBst · 2025-02-12T16:09:26Z

See #136931 for a PR that makes a start at exploring moving to iterators instead of callbacks.

nnethercote · 2025-02-12T21:54:37Z

Thanks for filing the new PRs. #136947 puts me back in the review rotation, once it is merged you'll be able to request reviews from me again. I'll wait until you've finished drafting the new PRs before taking a look, and I'll close this PR.

hkBst · 2025-03-05T18:31:40Z

I'd like to take another stab at this, using a macro to remove the code duplication, and removing the use of Peekable and perhaps other things that I now suspect of causing perf regressions. Should I file a new PR or do you want to reopen this one?

nnethercote · 2025-03-07T11:17:44Z

A new PR makes sense. That way the old approach doesn't get mixed up with the new approach.

rustbot assigned petrochenkov Feb 4, 2025

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Feb 4, 2025