Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Missed optimization opportunity when Vec::push follows Vec::pop #114334

Closed
krtab opened this issue Aug 1, 2023 · 5 comments · Fixed by #114370
Closed

Missed optimization opportunity when Vec::push follows Vec::pop #114334

krtab opened this issue Aug 1, 2023 · 5 comments · Fixed by #114370
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. I-slow Issue: Problems and improvements with respect to performance of generated code.

Comments

@krtab
Copy link
Contributor

krtab commented Aug 1, 2023

The following code

pub fn noop(v: &mut Vec<u8>) {
    if let Some(x) = v.pop() {
        v.push(x)
    }
}

should be optimized to a no-op afaict, but instead gives:

example::noop:
        push    rbp
        push    rbx
        push    rax
        mov     rcx, qword ptr [rdi + 16]
        test    rcx, rcx
        je      .LBB2_4
        mov     rbx, rdi
        lea     rsi, [rcx - 1]
        mov     qword ptr [rdi + 16], rsi
        mov     rax, qword ptr [rdi]
        movzx   ebp, byte ptr [rax + rcx - 1]
        cmp     rsi, qword ptr [rdi + 8]
        jne     .LBB2_3
        mov     rdi, rbx
        call    alloc::raw_vec::RawVec<T,A>::reserve_for_push
        mov     rax, qword ptr [rbx]
        mov     rsi, qword ptr [rbx + 16]
.LBB2_3:
        mov     byte ptr [rax + rsi], bpl
        inc     rsi
        mov     qword ptr [rbx + 16], rsi
.LBB2_4:
        add     rsp, 8
        pop     rbx
        pop     rbp
        ret

(this is true on stable, beta and nightly).

I think a blocking point here is that rustc/LLVM do not manage to infer the invariant:

pub fn noop(v: &mut Vec<u8>) {
     // v.len <= v.capacity
     if let Some(x) = v.pop() {
        // v.len < v.capacity
        v.push(x)
    }
}

Note that the version with added asserts for this invariant removes the useless conditional call to reserve_for_push but instead gets useless calls to assert failure code.

pub fn noop(v: &mut Vec<u8>) {
    assert!(v.len() <= v.capacity());
    if let Some(x) = v.pop() {
        assert!(v.len() < v.capacity());
        v.push(x)
    }
}
example::noop:
        push    rax
        mov     rcx, qword ptr [rdi + 8]
        mov     rax, qword ptr [rdi + 16]
        cmp     rax, rcx
        ja      .LBB0_6
        test    rax, rax
        je      .LBB0_4
        lea     rdx, [rax - 1]
        mov     qword ptr [rdi + 16], rdx
        cmp     rdx, rcx
        jae     .LBB0_5
        mov     qword ptr [rdi + 16], rax
.LBB0_4:
        pop     rax
        ret
.LBB0_6:
        lea     rdi, [rip + .L__unnamed_1]
        lea     rdx, [rip + .L__unnamed_2]
        mov     esi, 41
        call    qword ptr [rip + core::panicking::panic@GOTPCREL]
        ud2
.LBB0_5:
        lea     rdi, [rip + .L__unnamed_3]
        lea     rdx, [rip + .L__unnamed_4]
        mov     esi, 40
        call    qword ptr [rip + core::panicking::panic@GOTPCREL]
        ud2

.L__unnamed_3:
        .ascii  "assertion failed: v.len() < v.capacity()"

.L__unnamed_5:
        .ascii  "/app/example.rs"

.L__unnamed_4:
        .quad   .L__unnamed_5
        .asciz  "\017\000\000\000\000\000\000\000\025\000\000\000\t\000\000"

.L__unnamed_1:
        .ascii  "assertion failed: v.len() <= v.capacity()"

.L__unnamed_2:
        .quad   .L__unnamed_5
        .asciz  "\017\000\000\000\000\000\000\000\023\000\000\000\005\000\000"
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Aug 1, 2023
@saethlin saethlin added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. I-slow Issue: Problems and improvements with respect to performance of generated code. and removed needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Aug 2, 2023
@cynecx
Copy link
Contributor

cynecx commented Aug 2, 2023

I think there are two issues here:

  • the invariant self.len <= self.cap isn't encoded anywhere inside the push and pop methods
  • llvm's constraint elimination pass (in llvm 16) isn't smart enough to figure these things out (even with the assumes)

The first issue can be fixed by adding assume(self.len <= self.cap)s to push and pop. The second one is somewhat tricky, however we are fortunate that this has been improved somewhat already in upstream llvm:

https://godbolt.org/z/b5sc6T7ze (c++ reproducer, so it's not directly visible that the constraint elimination pass got improved)
https://godbolt.org/z/eq69G11xf (llvm-ir)

@krtab
Copy link
Contributor Author

krtab commented Aug 2, 2023

Thanks for investigating!

Unfortunately your code is equivalent to this Rust code which gets correctly optimized (https://godbolt.org/z/s4ansTsaG):

pub fn noop(v: &mut Vec<()>) {
    if let Some(x) = v.pop() {
        v.push(x)
    }
}

I'm not proficient in C++ but I'll try to find an example closer to Rust.

@krtab
Copy link
Contributor Author

krtab commented Aug 2, 2023

This gets correctly optimized, I'll try a PR to add the assume to Vec::pop and see how it does on benchmarks.

#![feature(core_intrinsics)]

use std::intrinsics::assume;


fn pop<T>(v: &mut Vec<T>) -> Option<T> {
    let res = v.pop();
    if res.is_some() {
        unsafe {assume(v.len() < v.capacity());}
    }
    res
}

pub fn noop(v: &mut Vec<u8>) {
    let tmp = pop(v);
    match tmp {
        Some(x) => v.push(x),
        None => ()
    }
}

@SUPERCILEX
Copy link
Contributor

Just ran into #105156 which seems similar

@SUPERCILEX
Copy link
Contributor

Also #82801

bors added a commit to rust-lang-ci/rust that referenced this issue Oct 16, 2023
Add invariant to Vec::pop that len < cap if pop successful

Fixes: rust-lang#114334
@bors bors closed this as completed in 0bcac8a Oct 16, 2023
bors added a commit to rust-lang-ci/rust that referenced this issue Mar 26, 2024
…cap, r=<try>

Add invariant to VecDeque::pop_* that len < cap if pop successful

Similar to rust-lang#114370 for VecDeque instead of Vec.

I initially come from rust-itertools/itertools#899 where we noticed that `pop_front;push_back;` was slower than expected so `@scottmcm` suggested I file an issue which lead to https://internals.rust-lang.org/t/vecdeque-pop-front-push-back/20483 where **kornel** mentionned rust-lang#114334 (fixed by rust-lang#114370).

This is my first time with codegen tests, I based the test on what was done for Vec.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Mar 27, 2024
…e_cap, r=Nilstrieb

Add invariant to VecDeque::pop_* that len < cap if pop successful

Similar to rust-lang#114370 for VecDeque instead of Vec.

I initially come from rust-itertools/itertools#899 where we noticed that `pop_front;push_back;` was slower than expected so `@scottmcm` suggested I file an issue which lead to https://internals.rust-lang.org/t/vecdeque-pop-front-push-back/20483 where **kornel** mentionned rust-lang#114334 (fixed by rust-lang#114370).

This is my first time with codegen tests, I based the test on what was done for Vec.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Apr 8, 2024
…e_cap, r=Nilstrieb

Add invariant to VecDeque::pop_* that len < cap if pop successful

Similar to rust-lang#114370 for VecDeque instead of Vec.

I initially come from rust-itertools/itertools#899 where we noticed that `pop_front;push_back;` was slower than expected so `@scottmcm` suggested I file an issue which lead to https://internals.rust-lang.org/t/vecdeque-pop-front-push-back/20483 where **kornel** mentionned rust-lang#114334 (fixed by rust-lang#114370).

This is my first time with codegen tests, I based the test on what was done for Vec.
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Apr 8, 2024
Rollup merge of rust-lang#123089 - Philippe-Cholet:vecdeque_pop_assume_cap, r=Nilstrieb

Add invariant to VecDeque::pop_* that len < cap if pop successful

Similar to rust-lang#114370 for VecDeque instead of Vec.

I initially come from rust-itertools/itertools#899 where we noticed that `pop_front;push_back;` was slower than expected so `@scottmcm` suggested I file an issue which lead to https://internals.rust-lang.org/t/vecdeque-pop-front-push-back/20483 where **kornel** mentionned rust-lang#114334 (fixed by rust-lang#114370).

This is my first time with codegen tests, I based the test on what was done for Vec.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. I-slow Issue: Problems and improvements with respect to performance of generated code.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants