Fix off-by-one spans in MIR borrowck errors #47420
src/librustc_mir/build/scope.rs
Outdated
// Attribute scope exit drops to scope's closing brace
let scope_end = region_scope_span.with_lo(region_scope_span.hi());
// Attribute scope exit drops to scope's closing brace.
// Without this check when finding the endpoint, we'll run into an ICE.
Should this be somehow handled in end_point?
I think it would make sense to have a check in end_point that makes sure it doesn't cause any overflow when doing self.hi().0 - 1.
I'm not entirely sure why the MIR borrow checker has any spans where that would be an issue (particularly since the AST borrow checker doesn't need this check). I only noticed this when compiling rustc_tsan in the std artifacts compilation step.
It is weird. Could you make that change so that any other code calling end_point doesn't need to worry about this?
Also, if you could post the ICE that happens in rustc_tsan it would be great, as I am intrigued as to which piece of code was triggering this (my guess is that it just made an existing bug apparent).
I'm already compiling a change that moves that check into end_point as I write this.
The only error it gave me that is left in my scrollback is below, which isn't very useful, sorry about that:
error: internal compiler error: unexpected panic
note: the compiler unexpectedly panicked. this is a bug.
note: we would appreciate a bug report: https://github.com/rust-lang/rust/blob/master/CONTRIBUTING.md#bug-reports
note: rustc 1.25.0-dev running on x86_64-unknown-linux-gnu
thread 'rustc' panicked at 'attempt to subtract with overflow', libsyntax_pos/lib.rs:222:27
note: Run with `RUST_BACKTRACE=1` for a backtrace.
I didn't bother to look into it further than that once I confirmed it was my change that caused it. I did notice, though, that the four instances of the ICE in my scrollback occurred in different modules: rustc_lsan, rustc_asan, compiler_builtins and rustc_tsan. Each run was probably with minor adjustments to my code to try to work out what was happening.
Cool. I'll r+ once that last change is in (and CI has run successfully).
The following was removed from the PR description:
I've seen that as well and your guess is correct. I haven't looked into how to avoid that issue yet.
It looks like an assertion in
It seems like we might have resurrected #18791. What's happening is that
That's just a stage 1 issue. Should be gone with the next snapshot. There are no error explanations in stage 1 since the system changed
src/libsyntax_pos/lib.rs
Outdated
@@ -219,7 +219,9 @@ impl Span {
    /// Returns a new span representing just the end-point of this span
    pub fn end_point(self) -> Span {
        let span = self.data();
        let lo = cmp::max(span.hi.0 - 1, span.lo.0);
        // We can avoid an ICE by checking if subtraction would cause an overflow.
        let hi = if span.hi.0 == u32::min_value() { span.hi.0 } else { span.hi.0 - 1 };
There is also checked_sub, e.g.:
let hi = span.hi.0.checked_sub(1).unwrap_or(span.hi.0);
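To make the overflow discussed here concrete, the following is a minimal sketch, not the actual libsyntax_pos code. `SimpleSpan` is a hypothetical stand-in that stores plain `u32` byte offsets, and its `end_point` only illustrates how `checked_sub` avoids the "attempt to subtract with overflow" panic when `hi` is 0.

```rust
/// Hypothetical stand-in for a span: plain byte offsets, nothing else.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SimpleSpan {
    pub lo: u32,
    pub hi: u32,
}

impl SimpleSpan {
    /// Returns a span covering just the last byte of `self`.
    pub fn end_point(self) -> SimpleSpan {
        // `checked_sub` yields `None` instead of panicking when `hi` is 0,
        // in which case we fall back to `hi` itself.
        let hi_minus_one = self.hi.checked_sub(1).unwrap_or(self.hi);
        // Never move the start before the original `lo`.
        let lo = std::cmp::max(hi_minus_one, self.lo);
        SimpleSpan { lo, hi: std::cmp::max(lo, self.hi) }
    }
}
```

With this sketch, a span of `0..0` maps to itself rather than triggering a panic, and `3..10` maps to `9..10`.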
I've pushed a fix for the multibyte characters. After discussion with @nikomatsakis on Gitter, I originally attempted to walk backwards from the
@bors r+ Great work!
📌 Commit 0ed07d2 has been approved by
src/libsyntax/codemap.rs
Outdated
let map = &(*files)[idx];

for mbc in map.multibyte_chars.borrow().iter() {
    if mbc.pos < bpos {
Hmm, I'm a bit concerned about the performance impact of this code. It seems to be O(n) in the position of the file, and I don't really think there's a good reason for this, right? Also, this is not on a "slow path", it happens during the core of borrow checking.
OTOH, I guess that in practice -- due to the fact that most files have no multibyte-characters -- it won't really be noticeable.
I'm a bit curious though to know why the "subtract one and go further back if needed" strategy didn't work out.
I think I'd probably be able to get the "subtract one and go further back if needed" strategy working, but when I implemented it I made a logic error: it would compile, but subsequent compiles would fall into an infinite loop. In a subsequent attempt, I took the approach currently in the PR.
I'd be happy to take another go at it if you'd like.
Pushed the "subtract one and go further back if needed" version.
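As an illustration of the "subtract one and go further back if needed" idea, here is a minimal sketch over a plain `&str`. It is not the code from this PR: `width_of_char_ending_at` is a hypothetical helper, and it relies only on the standard `str::is_char_boundary`. The loop always terminates because offset 0 is always a character boundary.

```rust
/// Hypothetical helper: byte width of the character that ends at byte
/// offset `end` in `src`. Falls back to 1 for offsets that are out of
/// range or not on a character boundary.
pub fn width_of_char_ending_at(src: &str, end: usize) -> usize {
    if end == 0 || end > src.len() || !src.is_char_boundary(end) {
        return 1;
    }
    // Subtract one, then keep stepping back while we are inside a
    // multibyte sequence. `is_char_boundary(0)` is always true, so
    // `start` cannot underflow and the loop cannot run forever.
    let mut start = end - 1;
    while !src.is_char_boundary(start) {
        start -= 1;
    }
    end - start
}
```

For the string `"a☃"` (4 bytes), the character ending at offset 4 is the three-byte snowman, while the character ending at offset 1 is the one-byte `a`.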
LGTM. It'd be nice if we could assert that spans are well-formed, but that seems like a separate issue.
Only one nitpick that shouldn't be a problem in practice, but I'd like to fix before merging.
src/libsyntax/codemap.rs
Outdated
// Disregard malformed spans and assume a one-byte wide character.
if sp.lo() > sp.hi() {
    return 1;
}
Did you trigger this check anywhere? I'd like to keep it, but there have been efforts to avoid creating malformed spans in the first place.
I don't create any spans that would trigger this. I know that if I attempt to assert that lo < hi then it fails when compiling the compiler, so I think this check is necessary for now.
Yes, we should definitely keep the check. Checks like this are peppered throughout the compiler because the compiler is making bad spans. Filed #47504 with my thoughts on the matter.
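The defensive check under discussion can be sketched in isolation as follows. This is an assumption-laden illustration rather than the codemap implementation: `ByteSpan` and `width_of_char_at` are hypothetical names, and the source text is passed in directly as a `&str`.

```rust
/// Hypothetical span with plain byte offsets into a source string.
#[derive(Debug, Clone, Copy)]
pub struct ByteSpan {
    pub lo: usize,
    pub hi: usize,
}

/// Byte width of the character starting at `span.lo`, falling back to 1
/// whenever the span is malformed or does not land on a character.
pub fn width_of_char_at(src: &str, span: ByteSpan) -> usize {
    // Disregard malformed spans and assume a one-byte wide character.
    if span.lo > span.hi {
        return 1;
    }
    // `str::get` returns `None` for out-of-range or mid-character offsets,
    // so no slicing here can panic.
    match src.get(span.lo..).and_then(|rest| rest.chars().next()) {
        Some(c) => c.len_utf8(),
        None => 1,
    }
}
```

The fallback keeps callers from panicking on bad input; it does not make the malformed span correct, which is the separate concern raised in #47504.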
src/libsyntax/codemap.rs
Outdated
let width = self.find_width_of_character_at_span(sp, true);
let corrected_next_position = pos.checked_add(width).unwrap_or(pos);

let next_point = BytePos(cmp::max(sp.hi().0, corrected_next_position));
You also need to account for the (low) chance that next_point might also be a wide character, as next_point could potentially be used anywhere, including at the start of an ident (in practice this might never happen, and even if it does, the presentation would be OK, but it would wreak havoc on tools depending on offsets).
Does the second parameter of find_width_of_character_at_span, which searches forward for the character boundary rather than backwards, not handle this, or am I misunderstanding?
My understanding of this code is that, in the code println!("☃☃"), if you have a span pointing at println!( and use next_point, the new span will point at the first ". If you call next_point on that span, it would point to 0xE2, when it should actually be pointing at 0xE2 0x98 0x83. This happens because of the line below, where you create the span with the same start and end. Your code correctly handles the case where your span points to the inside of the text 0xE2 0x98 0x83 0xE2 0x98 0x83 and you call next_point, yielding the second ". Does that make sense?
You would have to call find_width_of_character_at_span again to get the end point. In practice, this wouldn't be a problem for rustc, only for external tools trying to use the spans to perform changes in the code (such as in a suggestion, but we shouldn't be creating new spans on suggestions).
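The snowman scenario can be checked with a small sketch. The `next_point` below is a hypothetical free function over a `&str` and a byte offset, not the codemap method; it only shows what "point at the whole multibyte character" means in terms of byte ranges.

```rust
/// Hypothetical sketch: given `hi`, the end offset of the current span,
/// return the `(lo, hi)` byte range of the next point, widened so that it
/// covers a whole character rather than only its first byte.
pub fn next_point(src: &str, hi: usize) -> (usize, usize) {
    // Width of the character starting at `hi`; assume one byte if `hi` is
    // out of range, at the end of the text, or inside a character.
    let width = src
        .get(hi..)
        .and_then(|rest| rest.chars().next())
        .map_or(1, |c| c.len_utf8());
    (hi, hi + width)
}
```

For the source text `println!("☃☃")`, a span ending at offset 9 (just after `println!(`) gives `(9, 10)`, the opening quote. Calling it again with offset 10 gives `(10, 13)`, which covers all three bytes 0xE2 0x98 0x83 of the first snowman instead of only 0xE2.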
The original version of the next_point function returned a span with the same start and end. Wouldn't it have had the same issue?
Yes :)
I'm not sure I understand, what change is required? Never mind, I see now. The previous version had the issue because it didn't handle multibyte characters. next_point should point to the whole of the multibyte character and therefore, in those cases, not return the same position for lo and hi.
We can do that in a follow-up PR. I'll approve once CI is happy.
I've already got it added. Only had one little thing remaining yesterday but it was getting late.
No problem! Thank you for all the work you put into this! I know that the user facing change is not that big given the effort, but you're fixing quite a few potential pernicious ICEs :)
Pushed up the fix for this.
Resolved this, apologies for the delay.
0f88bc4 to 7d54eaf
Rebased and fixed a new test that was added that this affects. Should work now but haven't had an opportunity to run all the tests again locally.
7d54eaf to 9d4ca01
Resolved this. Was running into an issue with the
@bors r+
📌 Commit a1b72f7 has been approved by
Fix off-by-one spans in MIR borrowck errors Fixes rust-lang#46885. r? @nikomatsakis
r? @estebank
💔 Test failed - status-travis
https://travis-ci.org/rust-lang/rust/jobs/333941636#L7276
@bors retry
@bors: r- Oh I think this is the same as #47572 (comment), a legitimate infinite loop
@alexcrichton do we have a reason for it?
@estebank I was able to reproduce it a while back in the linked comment there (it's a reduction of a test case in Cargo)
I'll look into this, apologies.
…more performant variant.
a1b72f7 to 0bd9667
Infinite loop issue should now be resolved.
@bors: r=estebank no worries, thanks @davidtwco!
📌 Commit 0bd9667 has been approved by
💡 This pull request was already approved, no need to approve it again.
📌 Commit 0bd9667 has been approved by
Fix off-by-one spans in MIR borrowck errors Fixes #46885. r? @nikomatsakis
☀️ Test successful - status-appveyor, status-travis
Fixes #46885.
r? @nikomatsakis