Rustc fails to optimize a common option usage pattern #68667

Pzixel · 2020-01-30T12:28:57Z

Consider the following functions:

pub fn unwrap_combinators(a: Option<i32>, b: i32) -> bool {
    a.map(|t| t >= b)
     .unwrap_or(false)
}

pub fn unwrap_manual(a: Option<i32>, b: i32) -> bool {
    match a {
        Some(t) => t >= b,
        None => false
    }
}

The first pattern is what we often write, and the second is the most efficient, manually unrolled version. Surprisingly, rustc fails to optimize the former into the latter, as you can see in the godbolt listing:

example::unwrap_combinators:
        xor     eax, eax
        cmp     edx, esi
        setle   al
        test    edi, edi
        mov     ecx, 2
        cmovne  ecx, eax
        cmp     cl, 2
        setne   al
        and     al, cl
        ret

example::unwrap_manual:
        test    edi, edi
        setne   cl
        cmp     esi, edx
        setge   al
        and     al, cl
        ret

P.S. Yes, I'm aware of map_or
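
For reference, the map_or spelling mentioned in the P.S. can be sketched as follows. The name unwrap_map_or is made up for this illustration, and the snippet only checks that the three spellings return the same results; it says nothing about the code each one compiles to.

```rust
pub fn unwrap_combinators(a: Option<i32>, b: i32) -> bool {
    a.map(|t| t >= b).unwrap_or(false)
}

pub fn unwrap_manual(a: Option<i32>, b: i32) -> bool {
    match a {
        Some(t) => t >= b,
        None => false,
    }
}

// Hypothetical name, not from the issue.
// `Option::map_or` takes the default value first, then the closure.
pub fn unwrap_map_or(a: Option<i32>, b: i32) -> bool {
    a.map_or(false, |t| t >= b)
}

fn main() {
    // Compare all three spellings on a handful of inputs.
    for a in [None, Some(-1), Some(0), Some(7)] {
        for b in [-1, 0, 7] {
            assert_eq!(unwrap_combinators(a, b), unwrap_manual(a, b));
            assert_eq!(unwrap_map_or(a, b), unwrap_manual(a, b));
        }
    }
    println!("all three spellings agree");
}
```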


Aaron1011 · 2020-01-30T21:17:26Z

It looks like Rust's niche-filling optimization is defeating LLVM's optimizations.

In unwrap_combinators, the temporary Option<bool> uses the niche in bool for the discriminant, resulting in the following layout:

0 == Some(false)
1 == Some(true)
2 == None
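
One way to observe this encoding is to read back the byte behind each value. This is a rough probe, not a guarantee: the layout of Option<bool> is an implementation detail, and the helper raw_byte is a name invented for this sketch. It relies on Option<bool> being one byte wide; if it were not, the transmute would be rejected at compile time.

```rust
use std::mem::{size_of, transmute};

// Hypothetical helper: returns the single byte backing an Option<bool>.
// Assumes Option<bool> is exactly one byte; transmute refuses to compile otherwise.
fn raw_byte(v: Option<bool>) -> u8 {
    // SAFETY: source and target have the same size, the value has no padding,
    // and every bit pattern is a valid u8.
    unsafe { transmute::<Option<bool>, u8>(v) }
}

fn main() {
    println!("size_of::<Option<bool>>() = {}", size_of::<Option<bool>>());
    println!("Some(false) is stored as {}", raw_byte(Some(false)));
    println!("Some(true)  is stored as {}", raw_byte(Some(true)));
    println!("None        is stored as {}", raw_byte(None));
}
```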

LLVM ends up inlining the calls to map and unwrap_or, but is unable to optimize the particular combination of icmp and select that ends up in the LLVM IR.

This can be seen by changing the type to a tuple of (bool, ()), which inhibits Rust's niche-filling optimization (playground):

pub fn unwrap_combinators(a: Option<i32>, b: i32) -> bool {
    a.map(|t| (t >= b, ()))
     .unwrap_or((false, ()))
     .0
}

pub fn unwrap_manual(a: Option<i32>, b: i32) -> bool {
    match a {
        Some(t) => t >= b,
        None => false
    }
}

which generates the following ASM:

playground::unwrap_combinators:
	testl	%edi, %edi
	setne	%cl
	cmpl	%esi, %edx
	setle	%al
	andb	%cl, %al
	retq

playground::unwrap_manual:
	testl	%edi, %edi
	setne	%cl
	cmpl	%edx, %esi
	setge	%al
	andb	%cl, %al
	retq

LLVM has all of the information it needs to optimize the original IR - however, it doesn't seem to have an instcombine special-case that would allow it to do so.
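
To make the missed simplification concrete, here is a hand-written model of what the longer listing for unwrap_combinators computes. It is not compiler output. It assumes the 0/1/2 encoding described earlier in this comment, and the names encode and decode_slow are invented for the sketch. The first loop in main shows that the decode step is equivalent to testing whether the encoded value equals 1, which is the fold the optimizer would need to find.

```rust
// Model of the temporary Option<bool>, under the assumed encoding:
// 0 = Some(false), 1 = Some(true), 2 = None.
fn encode(a: Option<i32>, b: i32) -> u8 {
    match a {
        Some(t) => (t >= b) as u8,
        None => 2,
    }
}

// Mirrors the tail of the longer listing: (encoded != 2) & encoded.
fn decode_slow(encoded: u8) -> bool {
    (((encoded != 2) as u8) & encoded) != 0
}

// The manual version from the issue, used as the reference result.
fn unwrap_manual(a: Option<i32>, b: i32) -> bool {
    match a {
        Some(t) => t >= b,
        None => false,
    }
}

fn main() {
    // Over the three possible encodings, the slow decode is just `encoded == 1`.
    for encoded in 0u8..=2 {
        assert_eq!(decode_slow(encoded), encoded == 1);
    }
    // The encode/decode round trip agrees with the manual match.
    for a in [None, Some(-3), Some(0), Some(9)] {
        for b in [-3, 0, 9] {
            assert_eq!(decode_slow(encode(a, b)), unwrap_manual(a, b));
        }
    }
    println!("model matches unwrap_manual");
}
```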

Unfortunately, neither map nor unwrap_or gets MIR-inlined with -Z mir-opt-level=3, because the computed cost of each exceeds the inlining threshold:

[DEBUG rustc_mir::transform::inline] checking whether to inline callsite CallSite { callee: DefId(2:5043 ~ core[ca81]::option[0]::{{impl}}[0]::map[0]), substs: [i32, bool, [closure@slow.rs:3:11: 3:21 b:&i32]], bb: bb0, location: SourceInfo { span: slow.rs:3:5: 3:22, scope: scope[0] } }
[DEBUG rustc_mir::transform::inline] consider_optimizing(CallSite { callee: DefId(2:5043 ~ core[ca81]::option[0]::{{impl}}[0]::map[0]), substs: [i32, bool, [closure@slow.rs:3:11: 3:21 b:&i32]], bb: bb0, location: SourceInfo { span: slow.rs:3:5: 3:22, scope: scope[0] } })
[DEBUG rustc_mir::transform::inline] should_inline(CallSite { callee: DefId(2:5043 ~ core[ca81]::option[0]::{{impl}}[0]::map[0]), substs: [i32, bool, [closure@slow.rs:3:11: 3:21 b:&i32]], bb: bb0, location: SourceInfo { span: slow.rs:3:5: 3:22, scope: scope[0] } })
[DEBUG rustc_mir::transform::inline]     final inline threshold = 100
[DEBUG rustc_mir::transform::inline] NOT inlining CallSite { callee: DefId(2:5043 ~ core[ca81]::option[0]::{{impl}}[0]::map[0]), substs: [i32, bool, [closure@slow.rs:3:11: 3:21 b:&i32]], bb: bb0, location: SourceInfo { span: slow.rs:3:5: 3:22, scope: scope[0] } } [cost=204 > threshold=100]
[DEBUG rustc_mir::transform::inline] checking whether to inline callsite CallSite { callee: DefId(2:5040 ~ core[ca81]::option[0]::{{impl}}[0]::unwrap_or[0]), substs: [bool], bb: bb1, location: SourceInfo { span: slow.rs:3:5: 4:23, scope: scope[0] } }
[DEBUG rustc_mir::transform::inline] consider_optimizing(CallSite { callee: DefId(2:5040 ~ core[ca81]::option[0]::{{impl}}[0]::unwrap_or[0]), substs: [bool], bb: bb1, location: SourceInfo { span: slow.rs:3:5: 4:23, scope: scope[0] } })
[DEBUG rustc_mir::transform::inline] should_inline(CallSite { callee: DefId(2:5040 ~ core[ca81]::option[0]::{{impl}}[0]::unwrap_or[0]), substs: [bool], bb: bb1, location: SourceInfo { span: slow.rs:3:5: 4:23, scope: scope[0] } })
[DEBUG rustc_mir::transform::inline]     final inline threshold = 100
[DEBUG rustc_mir::transform::inline] NOT inlining CallSite { callee: DefId(2:5040 ~ core[ca81]::option[0]::{{impl}}[0]::unwrap_or[0]), substs: [bool], bb: bb1, location: SourceInfo { span: slow.rs:3:5: 4:23, scope: scope[0] } } [cost=123 > threshold=100]

The fact that LLVM decides to inline these functions suggests that we might be overly conservative in how we calculate inlining cost.

Hopefully, this situation will be improved by #68528, which specifically calls out unwrap_or as having improved MIR generation.

ecstatic-morse · 2020-01-30T21:31:55Z

@Aaron1011 I can confirm that, with #68528, unwrap_or becomes eligible for MIR inlining in unwrap_combinators, and the two functions compile to the same assembly with -Z mir-opt-level=3.

felix91gr · 2020-04-27T04:44:04Z

Should this issue be closed now, since it's been solved by #68528? Or maybe, since the fix is still behind mir-opt-level=3 and therefore unstable, it's still worth leaving open? :)

ecstatic-morse · 2020-04-27T20:02:29Z

This won't be fixed until MIR inlining becomes more usable and should remain open. I would like to see an "I-slow-fixed-by-MIR-inlining" tag so issues like this and #66234 can be triaged more efficiently.

felix91gr · 2020-04-28T06:21:27Z

That makes a lot of sense 🙂

Kobzol · 2022-08-09T13:51:59Z

It looks like the code is now optimized properly in recent nightly: https://rust.godbolt.org/z/sMhr83EMo, maybe because of enabled MIR inlining. Maybe we could add a codegen test for this?

Add codegen tests for E-needs-test close rust-lang#36010 close rust-lang#68667 close rust-lang#74938 close rust-lang#83585 close rust-lang#93036 close rust-lang#109328 close rust-lang#110797 close rust-lang#111508 close rust-lang#112509 close rust-lang#113757 close rust-lang#120440 close rust-lang#118392 close rust-lang#71096 r? nikic

jonas-schievink added C-bug Category: This is a bug. I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jan 30, 2020

jonas-schievink added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. A-mir-opt Area: MIR optimizations A-layout Area: Memory layout of types labels Mar 29, 2020

ecstatic-morse added the A-mir-opt-inlining Area: MIR inlining label Apr 27, 2020

cjgillot added the E-needs-test Call for participation: An issue has been fixed and does not reproduce, but no test has been added. label Jul 19, 2023

tesuji added a commit to tesuji/rustc that referenced this issue May 20, 2024

add codegen test for rust-lang#68667

78a4c24

tesuji mentioned this issue May 20, 2024

Add codegen tests for E-needs-test #125347

Merged

tesuji added a commit to tesuji/rustc that referenced this issue May 20, 2024

add codegen test for rust-lang#68667

3a54d86

tesuji added a commit to tesuji/rustc that referenced this issue Jun 8, 2024

add codegen test for rust-lang#68667

9f3fcd1

tesuji added a commit to tesuji/rustc that referenced this issue Jun 9, 2024

add codegen test for rust-lang#68667

5f527eb

bors closed this as completed in 7ac6c2f Jun 14, 2024
