Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix off-by-one spans in MIR borrowck errors #47420

Merged
merged 9 commits into from
Jan 27, 2018

Conversation

davidtwco
Copy link
Member

@davidtwco davidtwco commented Jan 13, 2018

Fixes #46885.

r? @nikomatsakis

// Attribute scope exit drops to scope's closing brace
let scope_end = region_scope_span.with_lo(region_scope_span.hi());
// Attribute scope exit drops to scope's closing brace.
// Without this check when finding the endpoint, we'll run into an ICE.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be somehow handled in end_point?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would make sense to have a check in end_point that makes sure it doesn't cause any overflow when doing self.hi().0 - 1.

I'm not entirely sure why the MIR borrow checker has any spans where that would be an issue (particularly since the AST borrow checker doesn't need this check). I only noticed this when compiling rustc_tsan in the std artifacts compilation step.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is weird. Could you make that change so that any other code calling end_point doesn't need to worry about this?

Also, if you could post the ICE that happens in rustc_tsan it would be great, as I am intrigued at which piece of code it was triggering this (my guess is that it just made an existing bug apparent).

Copy link
Member Author

@davidtwco davidtwco Jan 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm already compiling a change that moves that check into end_point as I write this.

The only error it gave me that is left in my scrollback is below, which isn't very useful, sorry about that:

error: internal compiler error: unexpected panic

note: the compiler unexpectedly panicked. this is a bug.

note: we would appreciate a bug report: https://github.com/rust-lang/rust/blob/master/CONTRIBUTING.md#bug-reports

note: rustc 1.25.0-dev running on x86_64-unknown-linux-gnu

thread 'rustc' panicked at 'attempt to subtract with overflow', libsyntax_pos/lib.rs:222:27
note: Run with `RUST_BACKTRACE=1` for a backtrace.

I didn't bother to look into it further than that when I confirmed it was my change that caused it. I did notice though that of the four instances of the ICE in my scrollback, they occured in different modules: rustc_lsan, rustc_asan, compiler_builtins and rustc_tsan - each run was probably with minor adjustments to my code to try work out what was happening.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool. I'll r+ once that last change is in (and CI has run successfully).

@estebank
Copy link
Contributor

The following was removed from the PR description:

I had two ui test errors in ui/explain.rs and ui/lint/use_suggestion_json.rs that seemed entirely unrelated to my changes so I assumed they were something strange with my local environment.

I've seen that as well and your guess is correct. I haven't looked into how to avoid that issue yet.

@estebank
Copy link
Contributor

estebank commented Jan 14, 2018

It looks like an assertion in bytepos_to_file_charpos checking wether a position falls in the middle of a wide character is failing in debuginfo/multi-byte-chars.rs:

[01:02:27] ---- [debuginfo-gdb] debuginfo/multi-byte-chars.rs stdout ----
[01:02:27] 	NOTE: compiletest thinks it is using GDB without native rust support
[01:02:27] 
[01:02:27] error: compilation failed!
[01:02:27] status: exit code: 101
[01:02:27] command: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustc" "/checkout/src/test/debuginfo/multi-byte-chars.rs" "-L" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/debuginfo" "--target=x86_64-unknown-linux-gnu" "-C" "prefer-dynamic" "-o" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/debuginfo/multi-byte-chars.stage2-x86_64-unknown-linux-gnu" "-Crpath" "-Zmiri" "-Zunstable-options" "-Lnative=/checkout/obj/build/x86_64-unknown-linux-gnu/native/rust-test-helpers" "-g" "-L" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/debuginfo/multi-byte-chars.stage2-x86_64-unknown-linux-gnu.gdb.aux"
[01:02:27] thread 'main' panicked at 'Some tests failed', tools/compiletest/src/main.rs:476:22
[01:02:27] stdout:
[01:02:27] ------------------------------------------
[01:02:27] 
[01:02:27] ------------------------------------------
[01:02:27] stderr:
[01:02:27] ------------------------------------------
[01:02:27] error: internal compiler error: unexpected panic
[01:02:27] 
[01:02:27] note: the compiler unexpectedly panicked. this is a bug.
[01:02:27] 
[01:02:27] note: we would appreciate a bug report: https://github.com/rust-lang/rust/blob/master/CONTRIBUTING.md#bug-reports
[01:02:27] 
[01:02:27] note: rustc 1.25.0-dev running on x86_64-unknown-linux-gnu
[01:02:27] 
[01:02:27] thread 'rustc' panicked at 'assertion failed: bpos.to_usize() >= mbc.pos.to_usize() + mbc.bytes', libsyntax/codemap.rs:613:17
[01:02:27] note: Run with `RUST_BACKTRACE=1` for a backtrace.
[01:02:27] 
[01:02:27] 
[01:02:27] ------------------------------------------
[01:02:27] 
[01:02:27] thread '[debuginfo-gdb] debuginfo/multi-byte-chars.rs' panicked at 'explicit panic', tools/compiletest/src/runtest.rs:2884:9
[01:02:27] note: Run with `RUST_BACKTRACE=1` for a backtrace.
[01:02:27] 
[01:02:27] 
[01:02:27] failures:
[01:02:27]     [debuginfo-gdb] debuginfo/multi-byte-chars.rs
[01:02:27] 
[01:02:27] test result: FAILED. 84 passed; 1 failed; 24 ignored; 0 measured; 0 filtered out

It seems like we might have resurrected #18791.

What's happening is that end_point is trying to point to the second half of θ. This... is annoying. We have to pass the tcx.sess.codemap() to both end_point and next_point so that they can verify the char width. If it is a wide unicode character, the span's end should be start + width for next_point and end_point's start has to be end - width. This is probably gonna affect quite a bit of code, but this is a bug lurking throughout the compiler that will blow up if non_ascii_idents becomes stable. Would you mind making that change?

@kennytm kennytm added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jan 14, 2018
@oli-obk
Copy link
Contributor

oli-obk commented Jan 14, 2018

I've seen that as well and your guess is correct. I haven't looked into how to avoid that issue yet.

That's just a stage 1 issue. Should be gone with the next snapshot. There are no error explanations in stage 1 since the system changed

@@ -219,7 +219,9 @@ impl Span {
/// Returns a new span representing just the end-point of this span
pub fn end_point(self) -> Span {
let span = self.data();
let lo = cmp::max(span.hi.0 - 1, span.lo.0);
// We can avoid an ICE by checking if subtraction would cause an overflow.
let hi = if span.hi.0 == u32::min_value() { span.hi.0 } else { span.hi.0 - 1 };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is also checked_sub -- e.g.,

let hi = span.hi.0.checked_sub(1).unwrap_or(span.hi.0);

@davidtwco
Copy link
Member Author

davidtwco commented Jan 14, 2018

I've pushed a fix for the multibyte characters. After discussion with @nikomatsakis on Gitter, I originally attempted to walk backwards from the BytePos until the start of the character, but I struggled to get this implemented so I took an alternative approach - not sure how good an approach this is or how performant it is.

end_point and next_point had to be moved into CodeMap since it cannot be used within Span without running into cyclic dependency issues.

@estebank
Copy link
Contributor

estebank commented Jan 15, 2018

@bors r+

Great work!

@bors
Copy link
Contributor

bors commented Jan 15, 2018

📌 Commit 0ed07d2 has been approved by estebank

@rust-lang rust-lang deleted a comment from bors Jan 15, 2018
@rust-lang rust-lang deleted a comment from bors Jan 15, 2018
let map = &(*files)[idx];

for mbc in map.multibyte_chars.borrow().iter() {
if mbc.pos < bpos {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I'm a bit concerned about the performance impact of this code. It seems to be O(n) in the position of the file, and I don't really think there's a good reason for this, right? Also, this is not on a "slow path", it happens during the core of borrow checking.

OTOH, I guess that in practice -- due to the fact that most files have no multibyte-characters -- it won't really be noticeable.

I'm a bit curious though to know why the "subtract one and go further back if needed" strategy didn't work out.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd probably be able to get the "subtract one and go further back if needed" strategy working, but when I implemented it, I made a logic error, it would compile but subsequent compiles would fall into an infinite loop. In a subsequent attempt, I took the approach currently in the PR.

I'd be happy to take another go at it if you'd like.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushed the "subtract one and go further back if needed" version.

@nikomatsakis
Copy link
Contributor

lgtm, it'd be nice if we could assert that spans are well-formed, but that seems like a separate issue

Copy link
Contributor

@estebank estebank left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only one nitpick that shouldn't be a problem in practice, but I'd like to fix before merging.

// Disregard malformed spans and assume a one-byte wide character.
if sp.lo() > sp.hi() {
return 1;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you trigger this check anywhere? I'd like to keep it, but there have been efforts to avoid creating malformed spans in the first place.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't create any spans that would trigger this. I know that if I attempt to assert that lo < hi then it fails when compiling the compiler, so I think this check is necessary for now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we should definitely keep the check. Checks like this are peppered throughout the compiler because the compiler is making bad spans. Filed #47504 with my thoughts on the matter.

let width = self.find_width_of_character_at_span(sp, true);
let corrected_next_position = pos.checked_add(width).unwrap_or(pos);

let next_point = BytePos(cmp::max(sp.hi().0, corrected_next_position));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You also need to account for the (low) chance that next_point might also be a wide character, as next_point could potentially be used anywhere, including at the start of an ident (in practice this might never happen, and even if it does, the presentation would be ok, but would break havoc on tools depending on offsets).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the second parameter of find_width_of_character_at_span that searches forward for the character boundary rather than backwards not handle this or am I misunderstanding?

Copy link
Contributor

@estebank estebank Jan 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding of this code is that in the code println!("☃☃") if you have a span pointing at println!( and use next_point, the new span will point at the first ". If you call next_point on that span, it would point to 0xE2, when it should actually be pointing at 0xE2 0x98 0x83. This happens because of the line below, where you create the span with the same start and end. Your code correctly handles the case where your span points to the inside of the text 0xE2 0x98 0x83 0xE2 0x98 0x83 and you call next_point, yielding the second ". Does that make sense?

You would have to call find_with_of_character_at_span again to get the end point. In practice, this wouldn't be a problem for rustc, only for external tools trying to use the spans to perform changes in the code (such as in a suggestion, but we shouldn't be creating new spans on suggestions).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the original version of the next_point function it returned a span with the same start and end, wouldn't it have had the same issue?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes :)

Copy link
Member Author

@davidtwco davidtwco Jan 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand, what change is required? Nevermind, I see now. The previous version had the issue because it didn't handle multibyte characters, next_point should point to the whole of the multibyte character and therefore in cases with those, not return the same span for lo and hi.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can do that on a follow up PR. I'll approve once ci is happy.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've already got it added. Only had one little thing remaining yesterday but it was getting late.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem! Thank you for all the work you put into this! I know that the user facing change is not that big given the effort, but you're fixing quite a few potential pernicious ICEs :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushed up the fix for this.

@davidtwco
Copy link
Member Author

davidtwco commented Jan 16, 2018

Forgot to run the ui tests after the change to the multibyte character handling today (and never realised that Travis was having troubles) so never noticed that something in the that change caused the spans to go back a position, will fix that and update that last commit.

Resolved this, apologies for the delay.

@davidtwco
Copy link
Member Author

davidtwco commented Jan 17, 2018

Noticed that the tests failed, which is strange - they all ran fine on my machine. Will look into it.

Rebased and fixed a new test that was added that this affects. Should work now but haven't had an opportunity to run all the tests again locally.

@nikomatsakis
Copy link
Contributor

@davidtwco
Copy link
Member Author

davidtwco commented Jan 17, 2018

@nikomatsakis working on it, seems like it is getting some spans that have the same hi and lo.

Resolved this. Was running into an issue with the compile-fail/enum-discrim-too-small2.rs test failing, it didn't seem related so I've pushed what fixed the previous errors.

@estebank
Copy link
Contributor

@bors r+

@bors
Copy link
Contributor

bors commented Jan 17, 2018

📌 Commit a1b72f7 has been approved by estebank

GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this pull request Jan 19, 2018
bors added a commit that referenced this pull request Jan 19, 2018
Rollup of 8 pull requests

- Successful merges: #46938, #47334, #47420, #47508, #47510, #47512, #47535, #47559
- Failed merges:
@nikomatsakis nikomatsakis added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 19, 2018
@nikomatsakis
Copy link
Contributor

r? @estebank

@bors
Copy link
Contributor

bors commented Jan 27, 2018

💔 Test failed - status-travis

@estebank
Copy link
Contributor

https://travis-ci.org/rust-lang/rust/jobs/333941636#L7276

[01:13:14] test cargo_fail_with_no_stderr has been running for over 60 seconds
No output has been received in the last 30m0s, this potentially indicates a stalled build or something wrong with the build itself.

@bors retry

@alexcrichton
Copy link
Member

@bors: r-

Oh I think this is the same as #47572 (comment), a legitimate infinite loop

@estebank
Copy link
Contributor

@alexcrichton do we have a reason for it?

@alexcrichton
Copy link
Member

@estebank I was able to reproduce it awhile back in the linked comment there (it's a reduction of a test case in Cargo)

@davidtwco
Copy link
Member Author

I'll look into this, apologies.

@davidtwco
Copy link
Member Author

davidtwco commented Jan 27, 2018

Infinite loop issue should now be resolved.

@alexcrichton
Copy link
Member

alexcrichton commented Jan 27, 2018

@bors: r=estebank

no worries, thanks @davidtwco!

@bors
Copy link
Contributor

bors commented Jan 27, 2018

📌 Commit 0bd9667 has been approved by estebnk

@bors
Copy link
Contributor

bors commented Jan 27, 2018

💡 This pull request was already approved, no need to approve it again.

@bors
Copy link
Contributor

bors commented Jan 27, 2018

📌 Commit 0bd9667 has been approved by estebank

@bors
Copy link
Contributor

bors commented Jan 27, 2018

⌛ Testing commit 0bd9667 with merge 7d6e5b9...

bors added a commit that referenced this pull request Jan 27, 2018
Fix off-by-one spans in MIR borrowck errors

Fixes #46885.

r? @nikomatsakis
@bors
Copy link
Contributor

bors commented Jan 27, 2018

☀️ Test successful - status-appveyor, status-travis
Approved by: estebank
Pushing 7d6e5b9 to master...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants