Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix problems with assoc expr token collection #128725

Merged
merged 6 commits into from
Aug 16, 2024

Conversation

nnethercote
Copy link
Contributor

There are several cases involving assoc exprs and attributes where the current code does the wrong thing. This PR adds some tests that demonstrate the problems and then fixes them.

r? @petrochenkov

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Aug 6, 2024
@nnethercote
Copy link
Contributor Author

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Aug 6, 2024
bors added a commit to rust-lang-ci/rust that referenced this pull request Aug 6, 2024
…roblems, r=<try>

Fix problems with assoc expr token collection

There are several cases involving assoc exprs and attributes where the current code does the wrong thing. This PR adds some tests that demonstrate the problems and then fixes them.

r? `@petrochenkov`
@bors
Copy link
Contributor

bors commented Aug 6, 2024

⌛ Trying commit ce7eb36 with merge 3a415f3...

@programmerjake
Copy link
Member

would you be willing to add some tests that you can invoke attribute proc macros where the input has attributes on expressions without needing nightly features? (I wasn't able to find a test for that, and that's already stable behavior that the library I'm writing depends on).

e.g. something like

// no nightly features needed
#[check_input_tokens]
fn f() {
    #[a] a + #[b] b + #[c] c;
}

@bors
Copy link
Contributor

bors commented Aug 6, 2024

☀️ Try build successful - checks-actions
Build commit: 3a415f3 (3a415f36e85767d18e3218034ba78a9807c72eba)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (3a415f3): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.2% [0.2%, 0.3%] 3
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-0.3% [-0.4%, -0.3%] 4
All ❌✅ (primary) 0.2% [0.2%, 0.3%] 3

Max RSS (memory usage)

Results (primary 0.1%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
1.7% [0.4%, 3.0%] 2
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-1.5% [-2.0%, -0.9%] 2
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 0.1% [-2.0%, 3.0%] 4

Cycles

Results (primary -2.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-2.0% [-2.0%, -2.0%] 1
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -2.0% [-2.0%, -2.0%] 1

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 760.794s -> 761.826s (0.14%)
Artifact size: 336.95 MiB -> 336.97 MiB (0.00%)

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Aug 6, 2024
compiler/rustc_parse/src/parser/mod.rs Outdated Show resolved Hide resolved
compiler/rustc_parse/src/parser/expr.rs Outdated Show resolved Hide resolved
compiler/rustc_parse/src/parser/expr.rs Outdated Show resolved Hide resolved
compiler/rustc_parse/src/parser/expr.rs Show resolved Hide resolved
@petrochenkov petrochenkov added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 6, 2024
@nnethercote
Copy link
Contributor Author

Performance effects are negligible, which matches what I saw locally. This is good because this change does increase the amount of TokenCursor snapshot-taking.

@nnethercote nnethercote force-pushed the fix-assoc-expr-collect-problems branch from ce7eb36 to 11ef8a4 Compare August 7, 2024 05:46
@nnethercote
Copy link
Contributor Author

I have addressed the comments. I also tweaked the stringify.rs test to use operators other than just +, including .. and as.

@petrochenkov
Copy link
Contributor

Let's do a crater run (to check if something like #128725 (comment) depends on the current behavior, in particular).
@bors try

@petrochenkov petrochenkov added S-waiting-on-crater Status: Waiting on a crater run to be completed. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Aug 7, 2024
bors added a commit to rust-lang-ci/rust that referenced this pull request Aug 7, 2024
…roblems, r=<try>

Fix problems with assoc expr token collection

There are several cases involving assoc exprs and attributes where the current code does the wrong thing. This PR adds some tests that demonstrate the problems and then fixes them.

r? `@petrochenkov`
@bors
Copy link
Contributor

bors commented Aug 7, 2024

⌛ Trying commit 11ef8a4 with merge 9c7f973...

@bors
Copy link
Contributor

bors commented Aug 7, 2024

☀️ Try build successful - checks-actions
Build commit: 9c7f973 (9c7f973c4f483238f1484927772d0446e5686cc0)

@petrochenkov
Copy link
Contributor

@craterbot check

@craterbot
Copy link
Collaborator

👌 Experiment pr-128725 created and queued.
🤖 Automatically detected try build 9c7f973
🔍 You can check out the queue and this experiment's details.

ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

@craterbot
Copy link
Collaborator

🚧 Experiment pr-128725 is now running

ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

@craterbot
Copy link
Collaborator

🎉 Experiment pr-128725 is completed!
📊 5 regressed and 1 fixed (496932 total)
📰 Open the full report.

⚠️ If you notice any spurious failure please add them to the blacklist!
ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 15, 2024
@bors
Copy link
Contributor

bors commented Aug 15, 2024

🔒 Merge conflict

This pull request and the master branch diverged in a way that cannot be automatically merged. Please rebase on top of the latest master branch, and let the reviewer approve again.

How do I rebase?

Assuming self is your fork and upstream is this repository, you can resolve the conflict following these steps:

  1. git checkout fix-assoc-expr-collect-problems (switch to your branch)
  2. git fetch upstream master (retrieve the latest master)
  3. git rebase upstream/master -p (rebase on top of it)
  4. Follow the on-screen instruction to resolve conflicts (check git status if you got lost).
  5. git push self fix-assoc-expr-collect-problems --force-with-lease (update this PR)

You may also read Git Rebasing to Resolve Conflicts by Drew Blessing for a short tutorial.

Please avoid the "Resolve conflicts" button on GitHub. It uses git merge instead of git rebase which makes the PR commit history more difficult to read.

Sometimes step 4 will complete without asking for resolution. This is usually due to difference between how Cargo.lock conflict is handled during merge and rebase. This is normal, and you should still perform step 5 to update this PR.

Error message
Auto-merging compiler/rustc_parse/src/parser/stmt.rs
CONFLICT (content): Merge conflict in compiler/rustc_parse/src/parser/stmt.rs
Auto-merging compiler/rustc_parse/src/parser/path.rs
Auto-merging compiler/rustc_parse/src/parser/pat.rs
Auto-merging compiler/rustc_parse/src/parser/mod.rs
Auto-merging compiler/rustc_parse/src/parser/item.rs
Auto-merging compiler/rustc_parse/src/parser/generics.rs
Auto-merging compiler/rustc_parse/src/parser/expr.rs
CONFLICT (content): Merge conflict in compiler/rustc_parse/src/parser/expr.rs
Auto-merging compiler/rustc_parse/src/parser/diagnostics.rs
Auto-merging compiler/rustc_parse/src/parser/attr.rs
Automatic merge failed; fix conflicts and then commit the result.

@bors bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Aug 15, 2024
It's not an important type when it comes to memory use.
This pre-existing type is suitable for use with the return value of the
`f` parameter in `collect_tokens_trailing_token`. The more descriptive
name will be useful because the next commit will add another boolean
value to the return value of `f`.
A couple of these are marked `FIXME` because they demonstrate existing
bugs with token collection.
This commit does the following.

- Renames `collect_tokens_trailing_token` as `collect_tokens`, because
  (a) it's annoying long, and (b) the `_trailing_token` bit is less
  accurate now that its types have changed.

- In `collect_tokens`, adds a `Option<CollectPos>` argument and a
  `UsePreAttrPos` in the return type of `f`. These are used in
  `parse_expr_force_collect` (for vanilla expressions) and in
  `parse_stmt_without_recovery` (for two different cases of expression
  statements). Together these ensure are enough to fix all the problems
  with token collection and assoc expressions. The changes to the
  `stringify.rs` test demonstrate some of these.

- Adds a new test. The code in this test was causing an assertion
  failure prior to this commit, due to an invalid `NodeRange`.

The extra complexity is annoying, but necessary to fix the existing
problems.
@nnethercote nnethercote force-pushed the fix-assoc-expr-collect-problems branch from 11ef8a4 to 9d31f86 Compare August 15, 2024 23:18
@nnethercote
Copy link
Contributor Author

I rebased.

@bors r=petrochenkov

@bors
Copy link
Contributor

bors commented Aug 15, 2024

📌 Commit 9d31f86 has been approved by petrochenkov

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Aug 15, 2024
@bors
Copy link
Contributor

bors commented Aug 16, 2024

⌛ Testing commit 9d31f86 with merge be0ea0c...

@bors
Copy link
Contributor

bors commented Aug 16, 2024

☀️ Test successful - checks-actions
Approved by: petrochenkov
Pushing be0ea0c to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Aug 16, 2024
@bors bors merged commit be0ea0c into rust-lang:master Aug 16, 2024
7 checks passed
@rustbot rustbot added this to the 1.82.0 milestone Aug 16, 2024
@nnethercote nnethercote deleted the fix-assoc-expr-collect-problems branch August 16, 2024 04:53
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (be0ea0c): comparison URL.

Overall result: ❌ regressions - ACTION NEEDED

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please open an issue or create a new PR that fixes the regressions, add a comment linking to the newly created issue or PR, and then add the perf-regression-triaged label to this PR.

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.3% [0.2%, 0.3%] 5
Regressions ❌
(secondary)
0.3% [0.2%, 0.5%] 5
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 0.3% [0.2%, 0.3%] 5

Max RSS (memory usage)

Results (primary 1.8%, secondary 1.9%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
1.8% [1.8%, 1.8%] 1
Regressions ❌
(secondary)
1.9% [0.8%, 2.8%] 3
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 1.8% [1.8%, 1.8%] 1

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 750.562s -> 751.437s (0.12%)
Artifact size: 339.13 MiB -> 339.15 MiB (0.01%)

@nnethercote
Copy link
Contributor Author

I'm not sure if the regressions are real -- a pre-merge CI run showed fewer regressions. And if they are real, they are small and few, and this is fixing a correctness issue.

@rustbot label: +perf-regression-triaged

@rustbot rustbot added the perf-regression-triaged The performance regression has been triaged. label Aug 16, 2024
nnethercote added a commit to nnethercote/rust that referenced this pull request Aug 21, 2024
This example triggers an assertion failure:
```
fn f() -> u32 {
    #[cfg_eval] #[cfg(not(FALSE))] 0
}
```
The sequence of events:
- `configure_annotatable` calls `parse_expr_force_collect`, which calls
  `collect_tokens`.
- Within that, we end up in `parse_expr_dot_or_call`, which again calls
  `collect_tokens`.
  - The return value of the `f` call is the expression `0`.
  - This inner call collects tokens for `0` (parser range 10..11) and
    creates a replacement covering `#[cfg(not(FALSE))] 0` (parser range
    0..11).
- We return to the outer `collect_tokens` call. The return value of the
  `f` call is *again* the expression `0`, again with the range 10..11,
  but the replacement from earlier covers the range 0..11. The code
  mistakenly assumes that any attributes from an inner `collect_tokens`
  call fit entirely within the body of the result of an outer
  `collect_tokens` call. So it adjusts the replacement parser range
  0..11 to a node range by subtracting 10, resulting in -10..1. This is
  an invalid range and triggers an assertion failure.

It's tricky to follow, but basically things get complicated when an AST
node is returned from an inner `collect_tokens` call and then returned
again from an outer `collect_token` node without being wrapped in any
kind of additional layer.

This commit changes `collect_tokens` to return early in some extra cases,
avoiding the construction of lazy tokens. In the example above, the
outer `collect_tokens` returns earlier because the `0` token already has
tokens and `self.capture_state.capturing` is `Capturing::No`. This early
return avoids the creation of the invalid range and the assertion
failure.

Fixes rust-lang#129166. Note: these invalid ranges have been happening for a long
time. rust-lang#128725 looks like it's at fault only because it introduced the
assertion that catches the invalid ranges.
nnethercote added a commit to nnethercote/rust that referenced this pull request Aug 21, 2024
This example triggers an assertion failure:
```
fn f() -> u32 {
    #[cfg_eval] #[cfg(not(FALSE))] 0
}
```
The sequence of events:
- `configure_annotatable` calls `parse_expr_force_collect`, which calls
  `collect_tokens`.
- Within that, we end up in `parse_expr_dot_or_call`, which again calls
  `collect_tokens`.
  - The return value of the `f` call is the expression `0`.
  - This inner call collects tokens for `0` (parser range 10..11) and
    creates a replacement covering `#[cfg(not(FALSE))] 0` (parser range
    0..11).
- We return to the outer `collect_tokens` call. The return value of the
  `f` call is *again* the expression `0`, again with the range 10..11,
  but the replacement from earlier covers the range 0..11. The code
  mistakenly assumes that any attributes from an inner `collect_tokens`
  call fit entirely within the body of the result of an outer
  `collect_tokens` call. So it adjusts the replacement parser range
  0..11 to a node range by subtracting 10, resulting in -10..1. This is
  an invalid range and triggers an assertion failure.

It's tricky to follow, but basically things get complicated when an AST
node is returned from an inner `collect_tokens` call and then returned
again from an outer `collect_token` node without being wrapped in any
kind of additional layer.

This commit changes `collect_tokens` to return early in some extra cases,
avoiding the construction of lazy tokens. In the example above, the
outer `collect_tokens` returns earlier because the `0` token already has
tokens and `self.capture_state.capturing` is `Capturing::No`. This early
return avoids the creation of the invalid range and the assertion
failure.

Fixes rust-lang#129166. Note: these invalid ranges have been happening for a long
time. rust-lang#128725 looks like it's at fault only because it introduced the
assertion that catches the invalid ranges.
nnethercote added a commit to nnethercote/rust that referenced this pull request Aug 21, 2024
This example triggers an assertion failure:
```
fn f() -> u32 {
    #[cfg_eval] #[cfg(not(FALSE))] 0
}
```
The sequence of events:
- `configure_annotatable` calls `parse_expr_force_collect`, which calls
  `collect_tokens`.
- Within that, we end up in `parse_expr_dot_or_call`, which again calls
  `collect_tokens`.
  - The return value of the `f` call is the expression `0`.
  - This inner call collects tokens for `0` (parser range 10..11) and
    creates a replacement covering `#[cfg(not(FALSE))] 0` (parser range
    0..11).
- We return to the outer `collect_tokens` call. The return value of the
  `f` call is *again* the expression `0`, again with the range 10..11,
  but the replacement from earlier covers the range 0..11. The code
  mistakenly assumes that any attributes from an inner `collect_tokens`
  call fit entirely within the body of the result of an outer
  `collect_tokens` call. So it adjusts the replacement parser range
  0..11 to a node range by subtracting 10, resulting in -10..1. This is
  an invalid range and triggers an assertion failure.

It's tricky to follow, but basically things get complicated when an AST
node is returned from an inner `collect_tokens` call and then returned
again from an outer `collect_token` node without being wrapped in any
kind of additional layer.

This commit changes `collect_tokens` to return early in some extra cases,
avoiding the construction of lazy tokens. In the example above, the
outer `collect_tokens` returns earlier because the `0` token already has
tokens and `self.capture_state.capturing` is `Capturing::No`. This early
return avoids the creation of the invalid range and the assertion
failure.

Fixes rust-lang#129166. Note: these invalid ranges have been happening for a long
time. rust-lang#128725 looks like it's at fault only because it introduced the
assertion that catches the invalid ranges.
nnethercote added a commit to nnethercote/rust that referenced this pull request Aug 23, 2024
This example triggers an assertion failure:
```
fn f() -> u32 {
    #[cfg_eval] #[cfg(not(FALSE))] 0
}
```
The sequence of events:
- `configure_annotatable` calls `parse_expr_force_collect`, which calls
  `collect_tokens`.
- Within that, we end up in `parse_expr_dot_or_call`, which again calls
  `collect_tokens`.
  - The return value of the `f` call is the expression `0`.
  - This inner call collects tokens for `0` (parser range 10..11) and
    creates a replacement covering `#[cfg(not(FALSE))] 0` (parser range
    0..11).
- We return to the outer `collect_tokens` call. The return value of the
  `f` call is *again* the expression `0`, again with the range 10..11,
  but the replacement from earlier covers the range 0..11. The code
  mistakenly assumes that any attributes from an inner `collect_tokens`
  call fit entirely within the body of the result of an outer
  `collect_tokens` call. So it adjusts the replacement parser range
  0..11 to a node range by subtracting 10, resulting in -10..1. This is
  an invalid range and triggers an assertion failure.

It's tricky to follow, but basically things get complicated when an AST
node is returned from an inner `collect_tokens` call and then returned
again from an outer `collect_token` node without being wrapped in any
kind of additional layer.

This commit changes `collect_tokens` to return early in some extra cases,
avoiding the construction of lazy tokens. In the example above, the
outer `collect_tokens` returns earlier because the `0` token already has
tokens and `self.capture_state.capturing` is `Capturing::No`. This early
return avoids the creation of the invalid range and the assertion
failure.

Fixes rust-lang#129166. Note: these invalid ranges have been happening for a long
time. rust-lang#128725 looks like it's at fault only because it introduced the
assertion that catches the invalid ranges.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. perf-regression Performance regression. perf-regression-triaged The performance regression has been triaged. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants