Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Broken match statement when compiling for thumbv6-none-eabi (ARM cortex-m0) #42248

Closed
alevy opened this issue May 26, 2017 · 16 comments · Fixed by #42740
Closed

Broken match statement when compiling for thumbv6-none-eabi (ARM cortex-m0) #42248

alevy opened this issue May 26, 2017 · 16 comments · Fixed by #42740
Assignees
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. O-Arm Target: 32-bit Arm processors (armv6, armv7, thumb...), including 64-bit Arm in AArch32 state P-high High priority regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@alevy
Copy link
Contributor

alevy commented May 26, 2017

After upgrading Tock to use a recent nightly, compilation for the NRF51-DK (a Cortex-M0 based board) results in unexpected assembly for a particular match statement.

The match statement looks like this:

impl kernel::Platform for Platform {
    fn with_driver<F, R>(&self, driver_num: usize, f: F) -> R
        where F: FnOnce(Option<&kernel::Driver>) -> R
    {
        match driver_num {
            0 => f(Some(self.console)),
            1 => f(Some(self.gpio)),
            3 => f(Some(self.timer)),
            8 => f(Some(self.led)),
            9 => f(Some(self.button)),
            14 => f(Some(self.rng)),
            17 => f(Some(self.aes)),
            36 => f(Some(self.temp)),
            _ => f(None),
        }
    }
}

The LLVM IR that (seems to) corresponds is:

bb49.i.i.i.i:                                     ; preds = %_ZN6kernel7process7Process18incr_syscall_count17h073864aec83cd68fE.exit
  %1192 = load volatile i32, i32* %1028, align 4
  switch i32 %1192, label %bb12.i.i.i1127 [
    i32 0, label %bb1.i.i.i.i.i
    i32 1, label %bb2.i.i.i.i.i
    i32 3, label %bb3.i.i.i.i.i
    i32 8, label %bb4.i57.i.i.i.i
    i32 9, label %bb5.i58.i.i.i.i
    i32 14, label %bb6.i.i.i.i.i
    i32 17, label %bb7.i.i.i.i.i
    i32 36, label %bb8.i.i.i.i.i
  ]

which is a fairly straight forward translation.

In general, the template for the expected output in assembly is something like:

;given r1 is `driver_num`, the variable we're matching against
lsls    r1, r1, #1
add     r1, pc
ldrh    r1, [r1, #4] ; load a half-word from the switch's offset table
lsls    r1, r1, #1 ; this half-word is actually 2x what it should be (idk why, just cause)
add     pc, r1 ; use the result as an offset from the current PC and jump there
                     ; (because we're executing a closure

The match statement seems to end up being broken up (makes sense since the matched values are sparse), with one of the cases (which is called at least when driver_num is 3) looking like this:

add     r1, pc, #16
lsls    r1, r1, #1
ldrh    r1, [r1, r1]
lsls    r1, r1, #1
add     pc, r1

This basically doesn't make any sense... it's grabbing a text segment address, multiplying it by two then loading a half-word from an address that's again multiplied by two (and at this point, completely outside the text segment so it could be anything really). In practice, that returns 0xffff, which after shifting left by one results in 0x1fffe. Finally, adding that to the current PC is way way way outside of the text segment and results in a hardware fault.

Notes

We can "fix" this in a variety of ways:

  • Commenting out the 17 valued branch (commenting out no other branch seems to do the trick)

  • Making the function inline(never)

  • Using opt-level 0 or 1

  • Compiling for better supported architectures by LLVM (e.g. thumbv7m-none-eabi) although those don't run on the same hardware so I can only confirm

Meta

This is using rustc 1.19.0-nightly (5b13bff52 2017-05-23)

Reproducing

Checkout commit d5565a227f8aa40e8c94efa9877a425003d02752 from helena-project/tock and:

$ make TOCK_BOARD=nrf51dk

the resulting binary will be boards/nrf51dk/target/thumbv6-none-eabi/release/nrf51dk, which you can read with objdump:

$ arm-none-eabi-objdump -d boards/nrf51dk/target/thumbv7-none-eabi/release/nrf51dk
@alevy
Copy link
Contributor Author

alevy commented May 26, 2017

/cc @rkruppe and @nagisa who helped debug this a bit on #rust-internals

@alevy
Copy link
Contributor Author

alevy commented May 26, 2017

The last nightly where this works as expected is rustc 1.17.0 (56124baa9 2017-04-24), which I believe is the last one before the upgrade to LLVM 4.0, which is at least some evidence that this is an LLVM bug.

@alevy
Copy link
Contributor Author

alevy commented May 26, 2017

I also compiled to thumb assembly the same LLVM-IR with llc from

  1. an older version of LLVM 3.9 and
  2. a newer version of LLVM from git.

LLVM 3.9 seems to use an entirely different mechanism for compiling switch statements to thumb assembly and recent git-LLVM seems to use the same strategy but doesn't result in the same strange compilation bug as described above.

I'm still figuring out if it's possible to link the result and run it on my hardware, but if I can confirm that (2) works as expected, it seems there may be an existing fix somewhere in the LLVM source tree between the 4.0 that Rust uses and bleeding edge.

@ppannuto
Copy link

As Tock is moving around this problem with the inline(never) fix, could you add a link to a commit hash that exhibits this issue?

@arielb1 arielb1 added I-wrong T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels May 29, 2017
@arielb1
Copy link
Contributor

arielb1 commented May 29, 2017

@ppannuto

LLVM bugs that cause scary miscompilations are not that rare, especially on non-x86 architectures. Normally, we work with the LLVM devs to get them fixed and backport the fixes to our LLVM version.

Can you upload LLVM IR that exhibits the problem?

@arielb1
Copy link
Contributor

arielb1 commented May 29, 2017

This change looks relevant:
llvm-mirror/llvm@a90b36e

It's not on our LLVM. If I had LLVM IR exhibiting the bug I could check whether it fixes the problem.

@arielb1
Copy link
Contributor

arielb1 commented May 29, 2017

Can confirm issue. Let me see whether the LLVM changes fix it.

@alevy
Copy link
Contributor Author

alevy commented May 29, 2017

@arielb1 you are my hero and savior!

@arielb1
Copy link
Contributor

arielb1 commented May 29, 2017

The bug is actually fixed by llvm-mirror/llvm@f4523b0 rather than the other commit. Now we only have to see what scary codegen bugs the other commits to that file fix.

@arielb1 arielb1 added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. labels May 29, 2017
@nikomatsakis
Copy link
Contributor

triage: P-high

@rust-highfive rust-highfive added P-high High priority and removed I-nominated labels Jun 8, 2017
@sanxiyn sanxiyn added the O-Arm Target: 32-bit Arm processors (armv6, armv7, thumb...), including 64-bit Arm in AArch32 state label Jun 10, 2017
@brson
Copy link
Contributor

brson commented Jun 15, 2017

Will this be fixed by the llvm upgrade #42410?

@brson
Copy link
Contributor

brson commented Jun 15, 2017

@arielb1 says the upgrade does not include the fix.

arielb1 added a commit to arielb1/rust that referenced this issue Jun 18, 2017
So ARM had quite a few codegen bugs on LLVM 4.0 which are fixed on LLVM
trunk. This backports 5 of them:
r297871 - ARM: avoid clobbering register in v6 jump-table expansion.
    - fixes rust-lang#42248
r294949 - [Thumb-1] TBB generation: spot redefinitions of index
r295816 - [ARM] Fix constant islands pass.
r300870 - [Thumb-1] Fix corner cases for compressed jump tables
r302650 - [IfConversion] Add missing check in
IfConversion/canFallThroughTo
    - unblocks rust-lang#39409
bors added a commit that referenced this issue Jun 19, 2017
Backport fixes to LLVM 4.0 ARM codegen bugs

So ARM had quite a few codegen bugs on LLVM 4.0 which are fixed on LLVM
trunk. This backports 5 of them:
r297871 - ARM: avoid clobbering register in v6 jump-table expansion.
    - fixes #42248
r294949 - [Thumb-1] TBB generation: spot redefinitions of index
r295816 - [ARM] Fix constant islands pass.
r300870 - [Thumb-1] Fix corner cases for compressed jump tables
r302650 - [IfConversion] Add missing check in
IfConversion/canFallThroughTo
    - unblocks #39409

r? @alexcrichton
beta-nominating because this fixes regressions introduced by LLVM 4.0.
@arielb1
Copy link
Contributor

arielb1 commented Jun 19, 2017

Can you confirm the generated code works @alevy ? I can't read ARM that well.

@arielb1 arielb1 reopened this Jun 19, 2017
@alevy
Copy link
Contributor Author

alevy commented Jun 19, 2017

Appears to as far as I can test. I've not been able to reliably get a minimal, repeatable example of the bug, but on code that breaks with previous nightly doesn't break with this one, so seems fixed to me!

@arielb1 arielb1 closed this as completed Jun 19, 2017
@arielb1
Copy link
Contributor

arielb1 commented Jun 19, 2017

Yay!

alevy added a commit to alevy/tock that referenced this issue Jun 19, 2017
Rust nightly 06-19-17 included LLVM upstream fixes that address an issue
we were running into on Cortex-M0 targets.

rust-lang/rust#42248
alevy added a commit to tock/tock that referenced this issue Jun 21, 2017
Turns out this Rust upstream issue is not resolved (actually an LLVM
issue, but we thought it was just about upgrading the version of LLVM
Rust uses):

rust-lang/rust#42248

It seems as though the assembly _is_ different, but still incorrect
(generating `add pc, r1` instead of `mov r0, r1; br pc` which is why I
wasn't catching it...)
@alevy
Copy link
Contributor Author

alevy commented Jun 21, 2017

@arielb1 I take it back, I didn't test properly. It looks like the generated code is still incorrect. FWIW, this seems like a difficult thing to fix on the Rust side other than just regularly upgrading LLVM versions, unless we (I?) can identify more clearly why this is happening in the LLVM codegen and where it's been fixed. Sorry @arielb1, I misled you... :/

No no no! False alarm! I hadn't upgraded to the right Rust version when I pushed my changes to Tock upstream! It works! Sorry again!

brson pushed a commit to brson/rust that referenced this issue Jun 22, 2017
So ARM had quite a few codegen bugs on LLVM 4.0 which are fixed on LLVM
trunk. This backports 5 of them:
r297871 - ARM: avoid clobbering register in v6 jump-table expansion.
    - fixes rust-lang#42248
r294949 - [Thumb-1] TBB generation: spot redefinitions of index
r295816 - [ARM] Fix constant islands pass.
r300870 - [Thumb-1] Fix corner cases for compressed jump tables
r302650 - [IfConversion] Add missing check in
IfConversion/canFallThroughTo
    - unblocks rust-lang#39409
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. O-Arm Target: 32-bit Arm processors (armv6, armv7, thumb...), including 64-bit Arm in AArch32 state P-high High priority regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants