Missed optimization: extra branch with slicing and integer overflow checks #58692

alex · 2019-02-24T05:57:45Z

Godbolt: https://rust.godbolt.org/z/PXY1lr

Basically the issue is that you compute s.position + n, check that it doesn't overflow, and then you verify that s.position + n is greater than s.position, which it of course always is, since you checked for an overflow! This is probably an LLVM optimization bug, but I'm starting here :-)

Command line arguments: -C opt-level=3 -C debug-assertions

Source:

pub struct S1<'a> {
    data: &'a [u8],
    position: usize,
}

pub fn f1<'a>(s: &'a mut S1, n: usize) -> &'a [u8] {
    let d = &s.data[s.position..s.position+n];
    s.position += n;
    return d;
}

ASM:

example::f1:
        push    rax
        mov     rax, qword ptr [rdi + 16]
        add     rsi, rax
        jb      .LBB0_4
        mov     rdx, rsi
        sub     rdx, rax
        jb      .LBB0_5
        mov     rcx, qword ptr [rdi + 8]
        cmp     rcx, rsi
        jb      .LBB0_6
        add     rax, qword ptr [rdi]
        mov     qword ptr [rdi + 16], rsi
        pop     rcx
        ret
.LBB0_4:
        lea     rdi, [rip + .L__unnamed_1]
        call    qword ptr [rip + _ZN4core9panicking5panic17h8f422ca74d186618E@GOTPCREL]
        ud2
.LBB0_5:
        mov     rdi, rax
        call    qword ptr [rip + _ZN4core5slice22slice_index_order_fail17h7ab7c9c6113ae4b8E@GOTPCREL]
        ud2
.LBB0_6:
        mov     rdi, rsi
        mov     rsi, rcx
        call    qword ptr [rip + _ZN4core5slice20slice_index_len_fail17h3a0f2c4c66447e54E@GOTPCREL]
        ud2

str.0:
        .ascii  "/tmp/compiler-explorer-compiler119124-62-359u5j.hrbs4/example.rs"

str.1:
        .ascii  "attempt to add with overflow"

.L__unnamed_1:
        .quad   str.1
        .quad   28
        .quad   str.0
        .quad   64
        .long   8
        .long   33

The text was updated successfully, but these errors were encountered:

nikic · 2019-02-24T09:06:02Z

Slightly reduced testcase:

define i64 @test(i64* %p, i64 %n) nounwind {
  %x = load i64, i64* %p
  %y = call { i64, i1 } @llvm.uadd.with.overflow.i64(i64 %x, i64 %n)
  %z = extractvalue { i64, i1 } %y, 1
  br i1 %z, label %bb3, label %bb1

bb1:
  %a = extractvalue { i64, i1 } %y, 0
  %b = icmp ult i64 %a, %x
  br i1 %b, label %bb3, label %bb2

bb2:
  store i64 %a, i64* %p
  ret i64 0

bb3:
  ret i64 1
}

declare { i64, i1 } @llvm.uadd.with.overflow.i64(i64, i64)

LLVM does not realize that %z and %b are the same here.

alex · 2019-02-24T16:59:56Z

Ok, looks like the thing to do is file an LLVM bug -- do you know off hand if they're ok with Rust reproducers, or do they prefer C ones?

alex · 2019-02-24T17:39:31Z

Ok, ended up converting to C because that seemed easy: https://bugs.llvm.org/show_bug.cgi?id=40845

nikic · 2019-02-24T18:26:29Z

Submitted https://bugs.llvm.org/show_bug.cgi?id=40846 additionally for the IR pattern above (which is an issue separate from the missing add.with.overflow+sub.with.overflow combining). Just verified that fixing this does indeed eliminate the slice_index_order_fail check in this example.

alex · 2019-11-03T00:52:34Z

@nikic am I correct in reading that this has been waiting on a review of the LLVM patch?

nikic · 2019-11-03T09:25:32Z

@alex Needs a rebase. More generally though, I think this is fighting a losing battle. uadd.with.overflow and usub.with.overflow are not considered canonical IR for a reason: They optimize badly in the middle end.

I think we'd be better off with generating the canonical variants instead (which are a + b; (a + b) < a and a - b; a < b). LLVM will still form uadd.with.overflow/usub.with.overflow during CGP, but this should optimize better in the middle end (including avoiding the issue here).

As we already have a good benchmark for the overhead of overflow checks in #58475, I think it might be worth trying to rerun that without using the unsigned overflow intrinsics.

comex · 2019-12-11T23:32:18Z

From the second LLVM bug above:

Fixed in llvm/llvm-project@8db5143.

I'm keeping this issue open, because the negated variation of this is not handled yet.

🎉

alex · 2019-12-12T02:08:34Z

🎉 once this makes it's way into Rust's LLVM (is there a good way to track this?) I'll verify that the assembly in this bug was improved, and if so I'll rebase the rustc patch and we can get some new numbers!

alex · 2020-05-22T15:15:51Z

Ok, confirmed that the recent LLVM10 update removes this branch: https://rust.godbolt.org/z/B38vDX

Shall this be closed?

Mark-Simulacrum · 2020-05-24T18:55:32Z

Actually, that was probably premature. We should add a codegen test.

alex · 2020-05-24T19:04:37Z

@Mark-Simulacrum is there a good example of what a test looks like? I can spin one up pretty easily if someone can point me at how to add one.

tmiasko · 2020-05-24T20:02:12Z

@alex Codegen tests are located in src/test/codegen. One test that is thematically related: slice-position-bounds-check.rs. They use LLVM FileCheck to verify generated LLVM IR.

Codegen tests can be run with:

./x.py test --stage 1 src/test/codegen

alex · 2020-05-24T20:20:36Z

Thanks much, that's exactly what I needed! PR is up now.

Added a codegen test for a recent optimization for overflow-checks=on Closes rust-lang#58692

alex mentioned this issue Feb 24, 2019

RFC changing the overflow behavior for usize in release builds to panic rust-lang/rfcs#2635

Closed

nikic added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. I-slow Issue: Problems and improvements with respect to performance of generated code. labels Feb 24, 2019

scottmcm added the WG-llvm Working group: LLVM backend code generation label Feb 26, 2019

alex mentioned this issue Mar 4, 2019

[WIP] Make usize overflow always have debug-assertions semantics #58475

Closed

Mark-Simulacrum closed this as completed May 24, 2020

Mark-Simulacrum reopened this May 24, 2020

Mark-Simulacrum added the E-needs-test Call for participation: An issue has been fixed and does not reproduce, but no test has been added. label May 24, 2020

alex mentioned this issue May 24, 2020

Added a codegen test for a recent optimization for overflow-checks=on #72547

Merged

Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this issue May 27, 2020

Rollup merge of rust-lang#72547 - alex:patch-1, r=oli-obk

e7d9396

Added a codegen test for a recent optimization for overflow-checks=on Closes rust-lang#58692

Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this issue May 27, 2020

Rollup merge of rust-lang#72547 - alex:patch-1, r=oli-obk

ad6f565

Added a codegen test for a recent optimization for overflow-checks=on Closes rust-lang#58692

JohnTitor added a commit to JohnTitor/rust that referenced this issue May 28, 2020

Rollup merge of rust-lang#72547 - alex:patch-1, r=oli-obk

59b6887

Added a codegen test for a recent optimization for overflow-checks=on Closes rust-lang#58692

JohnTitor added a commit to JohnTitor/rust that referenced this issue May 29, 2020

Rollup merge of rust-lang#72547 - alex:patch-1, r=oli-obk

05229f7

Added a codegen test for a recent optimization for overflow-checks=on Closes rust-lang#58692

bors closed this as completed in cd5f228 May 29, 2020

alex mannequin mentioned this issue Sep 10, 2022

Missed optimization with redundant overflow checks llvm/llvm-project#40191

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Missed optimization: extra branch with slicing and integer overflow checks #58692

Missed optimization: extra branch with slicing and integer overflow checks #58692

alex commented Feb 24, 2019

nikic commented Feb 24, 2019

alex commented Feb 24, 2019

alex commented Feb 24, 2019

nikic commented Feb 24, 2019

alex commented Nov 3, 2019

nikic commented Nov 3, 2019

comex commented Dec 11, 2019 •

edited

Loading

alex commented Dec 12, 2019

alex commented May 22, 2020

Mark-Simulacrum commented May 24, 2020

alex commented May 24, 2020

tmiasko commented May 24, 2020

alex commented May 24, 2020

Missed optimization: extra branch with slicing and integer overflow checks #58692

Missed optimization: extra branch with slicing and integer overflow checks #58692

Comments

alex commented Feb 24, 2019

nikic commented Feb 24, 2019

alex commented Feb 24, 2019

alex commented Feb 24, 2019

nikic commented Feb 24, 2019

alex commented Nov 3, 2019

nikic commented Nov 3, 2019

comex commented Dec 11, 2019 • edited Loading

alex commented Dec 12, 2019

alex commented May 22, 2020

Mark-Simulacrum commented May 24, 2020

alex commented May 24, 2020

tmiasko commented May 24, 2020

alex commented May 24, 2020

comex commented Dec 11, 2019 •

edited

Loading