-
Notifications
You must be signed in to change notification settings - Fork 12.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix problems with assoc expr token collection #128725
Fix problems with assoc expr token collection #128725
Conversation
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
…roblems, r=<try> Fix problems with assoc expr token collection There are several cases involving assoc exprs and attributes where the current code does the wrong thing. This PR adds some tests that demonstrate the problems and then fixes them. r? `@petrochenkov`
would you be willing to add some tests that you can invoke attribute proc macros where the input has attributes on expressions without needing nightly features? (I wasn't able to find a test for that, and that's already stable behavior that the library I'm writing depends on). e.g. something like // no nightly features needed
#[check_input_tokens]
fn f() {
#[a] a + #[b] b + #[c] c;
} |
☀️ Try build successful - checks-actions |
This comment has been minimized.
This comment has been minimized.
Finished benchmarking commit (3a415f3): comparison URL. Overall result: ❌✅ regressions and improvements - ACTION NEEDEDBenchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf. Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never Instruction countThis is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)Results (primary 0.1%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesResults (primary -2.0%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary sizeThis benchmark run did not return any relevant results for this metric. Bootstrap: 760.794s -> 761.826s (0.14%) |
Performance effects are negligible, which matches what I saw locally. This is good because this change does increase the amount of |
ce7eb36
to
11ef8a4
Compare
I have addressed the comments. I also tweaked the |
Let's do a crater run (to check if something like #128725 (comment) depends on the current behavior, in particular). |
…roblems, r=<try> Fix problems with assoc expr token collection There are several cases involving assoc exprs and attributes where the current code does the wrong thing. This PR adds some tests that demonstrate the problems and then fixes them. r? `@petrochenkov`
☀️ Try build successful - checks-actions |
@craterbot check |
👌 Experiment ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more |
🚧 Experiment ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more |
🎉 Experiment
|
🔒 Merge conflict This pull request and the master branch diverged in a way that cannot be automatically merged. Please rebase on top of the latest master branch, and let the reviewer approve again. How do I rebase?Assuming
You may also read Git Rebasing to Resolve Conflicts by Drew Blessing for a short tutorial. Please avoid the "Resolve conflicts" button on GitHub. It uses Sometimes step 4 will complete without asking for resolution. This is usually due to difference between how Error message
|
It's not an important type when it comes to memory use.
This pre-existing type is suitable for use with the return value of the `f` parameter in `collect_tokens_trailing_token`. The more descriptive name will be useful because the next commit will add another boolean value to the return value of `f`.
A couple of these are marked `FIXME` because they demonstrate existing bugs with token collection.
This commit does the following. - Renames `collect_tokens_trailing_token` as `collect_tokens`, because (a) it's annoying long, and (b) the `_trailing_token` bit is less accurate now that its types have changed. - In `collect_tokens`, adds a `Option<CollectPos>` argument and a `UsePreAttrPos` in the return type of `f`. These are used in `parse_expr_force_collect` (for vanilla expressions) and in `parse_stmt_without_recovery` (for two different cases of expression statements). Together these ensure are enough to fix all the problems with token collection and assoc expressions. The changes to the `stringify.rs` test demonstrate some of these. - Adds a new test. The code in this test was causing an assertion failure prior to this commit, due to an invalid `NodeRange`. The extra complexity is annoying, but necessary to fix the existing problems.
11ef8a4
to
9d31f86
Compare
I rebased. @bors r=petrochenkov |
☀️ Test successful - checks-actions |
Finished benchmarking commit (be0ea0c): comparison URL. Overall result: ❌ regressions - ACTION NEEDEDNext Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression Instruction countThis is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)Results (primary 1.8%, secondary 1.9%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesThis benchmark run did not return any relevant results for this metric. Binary sizeThis benchmark run did not return any relevant results for this metric. Bootstrap: 750.562s -> 751.437s (0.12%) |
I'm not sure if the regressions are real -- a pre-merge CI run showed fewer regressions. And if they are real, they are small and few, and this is fixing a correctness issue. @rustbot label: +perf-regression-triaged |
This example triggers an assertion failure: ``` fn f() -> u32 { #[cfg_eval] #[cfg(not(FALSE))] 0 } ``` The sequence of events: - `configure_annotatable` calls `parse_expr_force_collect`, which calls `collect_tokens`. - Within that, we end up in `parse_expr_dot_or_call`, which again calls `collect_tokens`. - The return value of the `f` call is the expression `0`. - This inner call collects tokens for `0` (parser range 10..11) and creates a replacement covering `#[cfg(not(FALSE))] 0` (parser range 0..11). - We return to the outer `collect_tokens` call. The return value of the `f` call is *again* the expression `0`, again with the range 10..11, but the replacement from earlier covers the range 0..11. The code mistakenly assumes that any attributes from an inner `collect_tokens` call fit entirely within the body of the result of an outer `collect_tokens` call. So it adjusts the replacement parser range 0..11 to a node range by subtracting 10, resulting in -10..1. This is an invalid range and triggers an assertion failure. It's tricky to follow, but basically things get complicated when an AST node is returned from an inner `collect_tokens` call and then returned again from an outer `collect_token` node without being wrapped in any kind of additional layer. This commit changes `collect_tokens` to return early in some extra cases, avoiding the construction of lazy tokens. In the example above, the outer `collect_tokens` returns earlier because the `0` token already has tokens and `self.capture_state.capturing` is `Capturing::No`. This early return avoids the creation of the invalid range and the assertion failure. Fixes rust-lang#129166. Note: these invalid ranges have been happening for a long time. rust-lang#128725 looks like it's at fault only because it introduced the assertion that catches the invalid ranges.
This example triggers an assertion failure: ``` fn f() -> u32 { #[cfg_eval] #[cfg(not(FALSE))] 0 } ``` The sequence of events: - `configure_annotatable` calls `parse_expr_force_collect`, which calls `collect_tokens`. - Within that, we end up in `parse_expr_dot_or_call`, which again calls `collect_tokens`. - The return value of the `f` call is the expression `0`. - This inner call collects tokens for `0` (parser range 10..11) and creates a replacement covering `#[cfg(not(FALSE))] 0` (parser range 0..11). - We return to the outer `collect_tokens` call. The return value of the `f` call is *again* the expression `0`, again with the range 10..11, but the replacement from earlier covers the range 0..11. The code mistakenly assumes that any attributes from an inner `collect_tokens` call fit entirely within the body of the result of an outer `collect_tokens` call. So it adjusts the replacement parser range 0..11 to a node range by subtracting 10, resulting in -10..1. This is an invalid range and triggers an assertion failure. It's tricky to follow, but basically things get complicated when an AST node is returned from an inner `collect_tokens` call and then returned again from an outer `collect_token` node without being wrapped in any kind of additional layer. This commit changes `collect_tokens` to return early in some extra cases, avoiding the construction of lazy tokens. In the example above, the outer `collect_tokens` returns earlier because the `0` token already has tokens and `self.capture_state.capturing` is `Capturing::No`. This early return avoids the creation of the invalid range and the assertion failure. Fixes rust-lang#129166. Note: these invalid ranges have been happening for a long time. rust-lang#128725 looks like it's at fault only because it introduced the assertion that catches the invalid ranges.
This example triggers an assertion failure: ``` fn f() -> u32 { #[cfg_eval] #[cfg(not(FALSE))] 0 } ``` The sequence of events: - `configure_annotatable` calls `parse_expr_force_collect`, which calls `collect_tokens`. - Within that, we end up in `parse_expr_dot_or_call`, which again calls `collect_tokens`. - The return value of the `f` call is the expression `0`. - This inner call collects tokens for `0` (parser range 10..11) and creates a replacement covering `#[cfg(not(FALSE))] 0` (parser range 0..11). - We return to the outer `collect_tokens` call. The return value of the `f` call is *again* the expression `0`, again with the range 10..11, but the replacement from earlier covers the range 0..11. The code mistakenly assumes that any attributes from an inner `collect_tokens` call fit entirely within the body of the result of an outer `collect_tokens` call. So it adjusts the replacement parser range 0..11 to a node range by subtracting 10, resulting in -10..1. This is an invalid range and triggers an assertion failure. It's tricky to follow, but basically things get complicated when an AST node is returned from an inner `collect_tokens` call and then returned again from an outer `collect_token` node without being wrapped in any kind of additional layer. This commit changes `collect_tokens` to return early in some extra cases, avoiding the construction of lazy tokens. In the example above, the outer `collect_tokens` returns earlier because the `0` token already has tokens and `self.capture_state.capturing` is `Capturing::No`. This early return avoids the creation of the invalid range and the assertion failure. Fixes rust-lang#129166. Note: these invalid ranges have been happening for a long time. rust-lang#128725 looks like it's at fault only because it introduced the assertion that catches the invalid ranges.
This example triggers an assertion failure: ``` fn f() -> u32 { #[cfg_eval] #[cfg(not(FALSE))] 0 } ``` The sequence of events: - `configure_annotatable` calls `parse_expr_force_collect`, which calls `collect_tokens`. - Within that, we end up in `parse_expr_dot_or_call`, which again calls `collect_tokens`. - The return value of the `f` call is the expression `0`. - This inner call collects tokens for `0` (parser range 10..11) and creates a replacement covering `#[cfg(not(FALSE))] 0` (parser range 0..11). - We return to the outer `collect_tokens` call. The return value of the `f` call is *again* the expression `0`, again with the range 10..11, but the replacement from earlier covers the range 0..11. The code mistakenly assumes that any attributes from an inner `collect_tokens` call fit entirely within the body of the result of an outer `collect_tokens` call. So it adjusts the replacement parser range 0..11 to a node range by subtracting 10, resulting in -10..1. This is an invalid range and triggers an assertion failure. It's tricky to follow, but basically things get complicated when an AST node is returned from an inner `collect_tokens` call and then returned again from an outer `collect_token` node without being wrapped in any kind of additional layer. This commit changes `collect_tokens` to return early in some extra cases, avoiding the construction of lazy tokens. In the example above, the outer `collect_tokens` returns earlier because the `0` token already has tokens and `self.capture_state.capturing` is `Capturing::No`. This early return avoids the creation of the invalid range and the assertion failure. Fixes rust-lang#129166. Note: these invalid ranges have been happening for a long time. rust-lang#128725 looks like it's at fault only because it introduced the assertion that catches the invalid ranges.
There are several cases involving assoc exprs and attributes where the current code does the wrong thing. This PR adds some tests that demonstrate the problems and then fixes them.
r? @petrochenkov