Fix off-by-one spans in MIR borrowck errors #47420

davidtwco · 2018-01-13T23:32:30Z

estebank · 2018-01-13T23:52:05Z

src/librustc_mir/build/scope.rs

-                // Attribute scope exit drops to scope's closing brace
-                let scope_end = region_scope_span.with_lo(region_scope_span.hi());
+                // Attribute scope exit drops to scope's closing brace.
+                // Without this check when finding the endpoint, we'll run into an ICE.


Should this be somehow handled in end_point?

I think it would make sense to have a check in end_point that makes sure it doesn't cause any overflow when doing self.hi().0 - 1.

I'm not entirely sure why the MIR borrow checker has any spans where that would be an issue (particularly since the AST borrow checker doesn't need this check). I only noticed this when compiling rustc_tsan in the std artifacts compilation step.

It is weird. Could you make that change so that any other code calling end_point doesn't need to worry about this?

Also, if you could post the ICE that happens in rustc_tsan it would be great, as I am intrigued at which piece of code it was triggering this (my guess is that it just made an existing bug apparent).

I'm already compiling a change that moves that check into end_point as I write this.

The only error it gave me that is left in my scrollback is below, which isn't very useful, sorry about that:

error: internal compiler error: unexpected panic
note: the compiler unexpectedly panicked. this is a bug.
note: we would appreciate a bug report: https://github.com/rust-lang/rust/blob/master/CONTRIBUTING.md#bug-reports
note: rustc 1.25.0-dev running on x86_64-unknown-linux-gnu
thread 'rustc' panicked at 'attempt to subtract with overflow', libsyntax_pos/lib.rs:222:27
note: Run with `RUST_BACKTRACE=1` for a backtrace.

I didn't bother to look into it further than that once I confirmed it was my change that caused it. I did notice, though, that the four instances of the ICE in my scrollback occurred in different modules: rustc_lsan, rustc_asan, compiler_builtins and rustc_tsan. Each run was probably with minor adjustments to my code to try to work out what was happening.

Cool. I'll r+ once that last change is in (and CI has run successfully).

estebank · 2018-01-13T23:53:32Z

The following was removed from the PR description:

I had two ui test errors in ui/explain.rs and ui/lint/use_suggestion_json.rs that seemed entirely unrelated to my changes so I assumed they were something strange with my local environment.

I've seen that as well and your guess is correct. I haven't looked into how to avoid that issue yet.

estebank · 2018-01-14T04:31:08Z

It looks like an assertion in bytepos_to_file_charpos, which checks whether a position falls in the middle of a wide character, is failing in debuginfo/multi-byte-chars.rs:

[01:02:27] ---- [debuginfo-gdb] debuginfo/multi-byte-chars.rs stdout ----
[01:02:27] 	NOTE: compiletest thinks it is using GDB without native rust support
[01:02:27] 
[01:02:27] error: compilation failed!
[01:02:27] status: exit code: 101
[01:02:27] command: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustc" "/checkout/src/test/debuginfo/multi-byte-chars.rs" "-L" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/debuginfo" "--target=x86_64-unknown-linux-gnu" "-C" "prefer-dynamic" "-o" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/debuginfo/multi-byte-chars.stage2-x86_64-unknown-linux-gnu" "-Crpath" "-Zmiri" "-Zunstable-options" "-Lnative=/checkout/obj/build/x86_64-unknown-linux-gnu/native/rust-test-helpers" "-g" "-L" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/debuginfo/multi-byte-chars.stage2-x86_64-unknown-linux-gnu.gdb.aux"
[01:02:27] thread 'main' panicked at 'Some tests failed', tools/compiletest/src/main.rs:476:22
[01:02:27] stdout:
[01:02:27] ------------------------------------------
[01:02:27] 
[01:02:27] ------------------------------------------
[01:02:27] stderr:
[01:02:27] ------------------------------------------
[01:02:27] error: internal compiler error: unexpected panic
[01:02:27] 
[01:02:27] note: the compiler unexpectedly panicked. this is a bug.
[01:02:27] 
[01:02:27] note: we would appreciate a bug report: https://github.com/rust-lang/rust/blob/master/CONTRIBUTING.md#bug-reports
[01:02:27] 
[01:02:27] note: rustc 1.25.0-dev running on x86_64-unknown-linux-gnu
[01:02:27] 
[01:02:27] thread 'rustc' panicked at 'assertion failed: bpos.to_usize() >= mbc.pos.to_usize() + mbc.bytes', libsyntax/codemap.rs:613:17
[01:02:27] note: Run with `RUST_BACKTRACE=1` for a backtrace.
[01:02:27] 
[01:02:27] 
[01:02:27] ------------------------------------------
[01:02:27] 
[01:02:27] thread '[debuginfo-gdb] debuginfo/multi-byte-chars.rs' panicked at 'explicit panic', tools/compiletest/src/runtest.rs:2884:9
[01:02:27] note: Run with `RUST_BACKTRACE=1` for a backtrace.
[01:02:27] 
[01:02:27] 
[01:02:27] failures:
[01:02:27]     [debuginfo-gdb] debuginfo/multi-byte-chars.rs
[01:02:27] 
[01:02:27] test result: FAILED. 84 passed; 1 failed; 24 ignored; 0 measured; 0 filtered out

It seems like we might have resurrected #18791.

What's happening is that end_point is trying to point to the second half of θ. This... is annoying. We have to pass the tcx.sess.codemap() to both end_point and next_point so that they can verify the char width. If it is a wide unicode character, the span's end should be start + width for next_point and end_point's start has to be end - width. This is probably gonna affect quite a bit of code, but this is a bug lurking throughout the compiler that will blow up if non_ascii_idents becomes stable. Would you mind making that change?
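To make the failure concrete, here is a minimal sketch of a width-aware end point. It works on a plain `&str` with `usize` byte offsets rather than rustc's `CodeMap` and `BytePos`, and `end_point_range` is a hypothetical name, not the function from this PR. It assumes `lo` and `hi` both sit on character boundaries.

```rust
/// Returns the byte range of the last character ending at `hi`.
/// Hypothetical helper: `src` stands in for the file contents and the
/// offsets stand in for `BytePos` values.
fn end_point_range(src: &str, lo: usize, hi: usize) -> (usize, usize) {
    // Empty, malformed, or out-of-range spans: there is nothing to step back over.
    if hi <= lo || hi > src.len() || !src.is_char_boundary(hi) {
        return (hi, hi);
    }
    // Width in bytes of the character that ends at `hi` (1 for ASCII, 2 for 'θ').
    let width = src[..hi].chars().next_back().map_or(1, |c| c.len_utf8());
    // Step back by the full width, but never before the start of the span.
    (std::cmp::max(hi - width, lo), hi)
}
```

For the source `let θ = 1;` and a span covering bytes 0..6, this returns `(4, 6)`, the whole of `θ`. Subtracting one from `hi` instead gives offset 5, which is the second byte of `θ` and is what trips the assertion.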

oli-obk · 2018-01-14T09:36:40Z

I've seen that as well and your guess is correct. I haven't looked into how to avoid that issue yet.

That's just a stage 1 issue. It should be gone with the next snapshot. There are no error explanations in stage 1 since the system changed.

nikomatsakis · 2018-01-14T12:19:19Z

src/libsyntax_pos/lib.rs

@@ -219,7 +219,9 @@ impl Span {
    /// Returns a new span representing just the end-point of this span
    pub fn end_point(self) -> Span {
        let span = self.data();
-        let lo = cmp::max(span.hi.0 - 1, span.lo.0);
+        // We can avoid an ICE by checking if subtraction would cause an overflow.
+        let hi = if span.hi.0 == u32::min_value() { span.hi.0 } else { span.hi.0 - 1 };


there is also checked_sub -- e.g.,

let hi = span.hi.0.checked_sub(1).unwrap_or(span.hi.0);
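As a rough illustration of the `checked_sub` suggestion, the sketch below computes the start of the end point on bare `u32` offsets. `end_point_lo` is a hypothetical stand-in for the arithmetic inside `end_point`, not the code that was merged.

```rust
/// Hypothetical stand-in for the `lo` computation inside `end_point`,
/// using plain `u32` byte offsets instead of `BytePos`.
fn end_point_lo(lo: u32, hi: u32) -> u32 {
    // `checked_sub` yields `None` when `hi` is 0, so we fall back to `hi`
    // instead of panicking with "attempt to subtract with overflow".
    let before_hi = hi.checked_sub(1).unwrap_or(hi);
    // Never move before the start of the original span.
    std::cmp::max(before_hi, lo)
}
```

With this, `end_point_lo(3, 10)` is 9, while `end_point_lo(0, 0)` stays at 0 rather than overflowing.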

davidtwco · 2018-01-14T17:33:59Z

I've pushed a fix for the multibyte characters. After discussion with @nikomatsakis on Gitter, I originally attempted to walk backwards from the BytePos until the start of the character, but I struggled to get this implemented so I took an alternative approach - not sure how good an approach this is or how performant it is.

end_point and next_point had to be moved into CodeMap since it cannot be used within Span without running into cyclic dependency issues.

estebank · 2018-01-15T01:45:55Z

@bors r+

Great work!

bors · 2018-01-15T01:45:56Z

📌 Commit 0ed07d2 has been approved by estebank

nikomatsakis · 2018-01-16T18:46:23Z

src/libsyntax/codemap.rs

+        let map = &(*files)[idx];
+
+        for mbc in map.multibyte_chars.borrow().iter() {
+            if mbc.pos < bpos {


Hmm, I'm a bit concerned about the performance impact of this code. It seems to be O(n) in the position of the file, and I don't really think there's a good reason for this, right? Also, this is not on a "slow path", it happens during the core of borrow checking.

OTOH, I guess that in practice -- due to the fact that most files have no multibyte-characters -- it won't really be noticeable.

I'm a bit curious though to know why the "subtract one and go further back if needed" strategy didn't work out.

I think I'd probably be able to get the "subtract one and go further back if needed" strategy working. When I first implemented it I made a logic error: the compiler would build, but subsequent compiles with it would fall into an infinite loop. In a later attempt, I took the approach currently in the PR.

I'd be happy to take another go at it if you'd like.

Pushed the "subtract one and go further back if needed" version.
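For readers following along, the "subtract one and go further back if needed" idea can be sketched on a plain `&str` as below. This is only an illustration under the assumption that the source text is available as UTF-8; `prev_char_start` is a hypothetical name and the version pushed to the PR works on `CodeMap` data instead.

```rust
/// Walks backwards from `pos - 1` until a UTF-8 character boundary is found.
/// Hypothetical helper operating on a string slice, not on rustc's `CodeMap`.
fn prev_char_start(src: &str, pos: usize) -> usize {
    // Clamp first so a malformed position cannot index past the end.
    let mut start = pos.min(src.len()).saturating_sub(1);
    // `start` strictly decreases and offset 0 is always a boundary,
    // so this loop is guaranteed to terminate.
    while !src.is_char_boundary(start) {
        start -= 1;
    }
    start
}
```

The cost is bounded by the width of one character (at most four bytes in UTF-8), independent of how far into the file the position is.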

nikomatsakis · 2018-01-16T20:44:20Z

lgtm, it'd be nice if we could assert that spans are well-formed, but that seems like a separate issue

estebank

Only one nitpick that shouldn't be a problem in practice, but I'd like to fix before merging.

estebank · 2018-01-16T21:49:45Z

src/libsyntax/codemap.rs

+        // Disregard malformed spans and assume a one-byte wide character.
+        if sp.lo() > sp.hi() {
+            return 1;
+        }


Did you trigger this check anywhere? I'd like to keep it, but there have been efforts to avoid creating malformed spans in the first place.

I don't create any spans that would trigger this. I know that if I attempt to assert that lo < hi then it fails when compiling the compiler, so I think this check is necessary for now.

Yes, we should definitely keep the check. Checks like this are peppered throughout the compiler because the compiler is making bad spans. Filed #47504 with my thoughts on the matter.

estebank · 2018-01-16T21:53:13Z

src/libsyntax/codemap.rs

+        let width = self.find_width_of_character_at_span(sp, true);
+        let corrected_next_position = pos.checked_add(width).unwrap_or(pos);
+
+        let next_point = BytePos(cmp::max(sp.hi().0, corrected_next_position));


You also need to account for the (low) chance that next_point might also be a wide character, as next_point could potentially be used anywhere, including at the start of an ident (in practice this might never happen, and even if it does, the presentation would be ok, but it would wreak havoc on tools depending on offsets).

Does the second parameter of find_width_of_character_at_span that searches forward for the character boundary rather than backwards not handle this or am I misunderstanding?

My understanding of this code is that in the code println!("☃☃") if you have a span pointing at println!( and use next_point, the new span will point at the first ". If you call next_point on that span, it would point to 0xE2, when it should actually be pointing at 0xE2 0x98 0x83. This happens because of the line below, where you create the span with the same start and end. Your code correctly handles the case where your span points to the inside of the text 0xE2 0x98 0x83 0xE2 0x98 0x83 and you call next_point, yielding the second ". Does that make sense?

You would have to call find_width_of_character_at_span again to get the end point. In practice, this wouldn't be a problem for rustc, only for external tools trying to use the spans to perform changes in the code (such as in a suggestion, but we shouldn't be creating new spans on suggestions).

In the original version of the next_point function it returned a span with the same start and end, wouldn't it have had the same issue?

~~I'm not sure I understand, what change is required?~~ Nevermind, I see now. The previous version had the issue because it didn't handle multibyte characters, next_point should point to the whole of the multibyte character and therefore in cases with those, not return the same span for lo and hi.
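The `println!("☃☃")` case from this thread can be worked through with a small sketch. It assumes plain byte offsets into a `&str`; `next_point_range` is a hypothetical helper, not the `next_point` implementation in `CodeMap`.

```rust
/// Returns the byte range of the whole character that starts at `hi`.
/// Hypothetical helper: `src` stands in for the file contents.
fn next_point_range(src: &str, hi: usize) -> (usize, usize) {
    // Past the end or in the middle of a character: return an empty range
    // rather than guess at a width.
    if hi >= src.len() || !src.is_char_boundary(hi) {
        return (hi, hi);
    }
    // Width in bytes of the character starting at `hi` (3 for '☃').
    let width = src[hi..].chars().next().map_or(0, |c| c.len_utf8());
    (hi, hi + width)
}
```

In `println!("☃☃")` the opening quote is at byte 9 and the first snowman occupies bytes 10..13. Starting from a span ending at 9, the next point is `(9, 10)`, the quote; applying it again at 10 gives `(10, 13)`, covering all of `0xE2 0x98 0x83` rather than only `0xE2`.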

We can do that on a follow up PR. I'll approve once ci is happy.

I've already got it added. Only had one little thing remaining yesterday but it was getting late.

No problem! Thank you for all the work you put into this! I know that the user facing change is not that big given the effort, but you're fixing quite a few potential pernicious ICEs :)

Pushed up the fix for this.

davidtwco · 2018-01-16T23:02:17Z

Forgot to run the ui tests after today's change to the multibyte character handling (and never realised that Travis was having troubles), so I never noticed that something in that change caused the spans to go back a position. Will fix that and update that last commit.

Resolved this, apologies for the delay.

davidtwco · 2018-01-17T11:46:46Z

~~Noticed that the tests failed, which is strange - they all ran fine on my machine. Will look into it.~~

Rebased and fixed a new test that was added that this affects. Should work now but haven't had an opportunity to run all the tests again locally.

nikomatsakis · 2018-01-17T16:12:41Z

@davidtwco hmm, seeing some ICEs in the travis tests

davidtwco · 2018-01-17T16:13:52Z

~~@nikomatsakis working on it, seems like it is getting some spans that have the same hi and lo.~~

Resolved this. Was running into an issue with the compile-fail/enum-discrim-too-small2.rs test failing, it didn't seem related so I've pushed what fixed the previous errors.

estebank · 2018-01-17T19:20:53Z

@bors r+

bors · 2018-01-17T19:20:54Z

📌 Commit a1b72f7 has been approved by estebank

@nikomatsakis

Fix off-by-one spans in MIR borrowck errors Fixes rust-lang#46885. r? @nikomatsakis

Rollup of 8 pull requests - Successful merges: #46938, #47334, #47420, #47508, #47510, #47512, #47535, #47559 - Failed merges:

nikomatsakis · 2018-01-19T21:48:11Z

r? @estebank

bors · 2018-01-27T00:28:08Z

💔 Test failed - status-travis

estebank · 2018-01-27T00:41:15Z

https://travis-ci.org/rust-lang/rust/jobs/333941636#L7276

[01:13:14] test cargo_fail_with_no_stderr has been running for over 60 seconds
No output has been received in the last 30m0s, this potentially indicates a stalled build or something wrong with the build itself.

@bors retry

alexcrichton · 2018-01-27T00:45:54Z

@bors: r-

Oh I think this is the same as #47572 (comment), a legitimate infinite loop

estebank · 2018-01-27T01:06:29Z

@alexcrichton do we have a reason for it?

alexcrichton · 2018-01-27T03:14:08Z

@estebank I was able to reproduce it awhile back in the linked comment there (it's a reduction of a test case in Cargo)

davidtwco · 2018-01-27T11:05:21Z

I'll look into this, apologies.

…more performant variant.

davidtwco · 2018-01-27T13:31:14Z

Infinite loop issue should now be resolved.

alexcrichton · 2018-01-27T18:36:56Z

@bors: r=estebank

no worries, thanks @davidtwco!

bors · 2018-01-27T18:36:57Z

📌 Commit 0bd9667 has been approved by estebnk

bors · 2018-01-27T18:37:02Z

💡 This pull request was already approved, no need to approve it again.

There's another pull request that is currently being tested, blocking this pull request: Make region inference use a dirty list #47766

bors · 2018-01-27T18:37:02Z

📌 Commit 0bd9667 has been approved by estebank

bors · 2018-01-27T19:41:47Z

⌛ Testing commit 0bd9667 with merge 7d6e5b9...

@nikomatsakis

Fix off-by-one spans in MIR borrowck errors Fixes #46885. r? @nikomatsakis

bors · 2018-01-27T22:41:40Z

☀️ Test successful - status-appveyor, status-travis
Approved by: estebank
Pushing 7d6e5b9 to master...

rust-highfive assigned nikomatsakis Jan 13, 2018

davidtwco mentioned this pull request Jan 13, 2018

Off-by-one spans in MIR borrowck errors #46885

Closed

estebank reviewed Jan 13, 2018


kennytm added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jan 14, 2018

nikomatsakis reviewed Jan 14, 2018


rust-lang deleted a comment from bors Jan 15, 2018

nikomatsakis reviewed Jan 16, 2018


estebank approved these changes Jan 16, 2018


davidtwco force-pushed the issue-46885 branch from 0f88bc4 to 7d54eaf Compare January 17, 2018 10:03

davidtwco force-pushed the issue-46885 branch from 7d54eaf to 9d4ca01 Compare January 17, 2018 14:29

GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this pull request Jan 19, 2018

Rollup merge of rust-lang#47420 - davidtwco:issue-46885, r=estebank

ac75587

Fix off-by-one spans in MIR borrowck errors Fixes rust-lang#46885. r? @nikomatsakis

GuillaumeGomez mentioned this pull request Jan 19, 2018

Rollup of 8 pull requests #47572

Closed

bors added a commit that referenced this pull request Jan 19, 2018

Auto merge of #47572 - GuillaumeGomez:rollup, r=GuillaumeGomez

a7b8622

Rollup of 8 pull requests - Successful merges: #46938, #47334, #47420, #47508, #47510, #47512, #47535, #47559 - Failed merges:

nikomatsakis added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 19, 2018

davidtwco added 9 commits January 27, 2018 11:46

Updated tests with fixed span location.

6d00c96

Fixed off-by-one spans in MIR borrowck errors.

f6fee2a

Moved overflow check into end_point function.

c6e6428

end_point handling multibyte characters correctly.

c71cec8

Replaced multi-byte character handling in end_point with potentially …

6235647

…more performant variant.

next_point now handles creating spans over multibyte characters.

be465b0

Fix new test from rebase.

71b7500

Now handling case where span has same lo and hi.

0c467d5

Fixed infinite loop issues and added some improved logging.

0bd9667

davidtwco force-pushed the issue-46885 branch from a1b72f7 to 0bd9667 Compare January 27, 2018 13:30

bors added a commit that referenced this pull request Jan 27, 2018

Auto merge of #47420 - davidtwco:issue-46885, r=estebank

7d6e5b9

Fix off-by-one spans in MIR borrowck errors Fixes #46885. r? @nikomatsakis

bors merged commit 0bd9667 into rust-lang:master Jan 27, 2018

Mark-Simulacrum mentioned this pull request Feb 12, 2018

Compiler slowdown on lage module in nightly #48153

Closed

davidtwco deleted the issue-46885 branch February 25, 2018 12:18

ehuss mentioned this pull request May 29, 2019

… in comment cause unexpected rustc panic #61226

Closed

ehuss mentioned this pull request Oct 18, 2022

Fix the bug of next_point in source_map #103185

Merged
