[Experiment] Eliminate possible `Vec::push` branches #121300

c410-f3r · 2024-02-19T15:08:04Z

Related to #105156. Requesting a perf run.

pub fn push(v: &mut Vec<u8>) {
    let _ = v.reserve(4);
    v.push(1);
    v.push(2);
    v.push(3);
    v.push(4);
}

AFAICT, the codegen backend should be able to infer that these `push` calls are infallible, but with LLVM 18 the output still contains unnecessary `reserve_for_push` branches.

Out of curiosity, `assert_unchecked` was added to `push` to see what impact such a change would have. The generated assembly is at https://godbolt.org/z/b5jjPhsf8.

AFAICT (again), the assumption that more capacity is available for each `push` does not hold in all situations.
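For illustration only, here is a caller-side sketch of the same idea; it is not the change made in this PR. `push_four` and `fill_to_capacity` are hypothetical names. The first function states the capacity fact to the optimizer before each `push` (whether the branches actually disappear depends on the compiler and LLVM version). The second shows why a blanket "spare capacity remains after a push" assumption cannot hold: a vector can be filled exactly to its capacity.

```rust
use std::hint::assert_unchecked;

/// Hypothetical caller-side variant of the hint: reserve once, then tell the
/// optimizer before each push that spare capacity exists.
pub fn push_four(v: &mut Vec<u8>) {
    v.reserve(4);
    for byte in 1..=4u8 {
        // SAFETY: `reserve(4)` guarantees capacity >= original len + 4, and
        // fewer than four elements have been pushed since, so len < capacity.
        unsafe { assert_unchecked(v.len() < v.capacity()) };
        v.push(byte);
    }
}

/// Pushes until the buffer is exactly full and returns the vector.
/// Afterwards `len == capacity`, so "spare capacity remains after a push"
/// is false for the last of these pushes.
pub fn fill_to_capacity(mut v: Vec<u8>) -> Vec<u8> {
    while v.len() < v.capacity() {
        v.push(0);
    }
    v
}
```

The `unsafe` hint is only sound here because the preceding `reserve(4)` establishes the bound locally; inside a general-purpose `push` no such bound is known.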

rustbot · 2024-02-19T15:08:13Z

r? @Mark-Simulacrum

rustbot has assigned @Mark-Simulacrum.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

library/alloc/src/vec/mod.rs

Noratrieb · 2024-02-19T15:26:44Z

fixed it for you, it should work now
@bors try @rust-timer queue

bors · 2024-02-19T15:28:52Z

⌛ Trying commit e64d4ba with merge ba16851...

bors · 2024-02-19T16:56:39Z

☀️ Try build successful - checks-actions
Build commit: ba16851 (ba168511861d72f06ae72268ed19ac3d62f126b2)

rust-timer · 2024-02-20T00:32:29Z

Finished benchmarking commit (ba16851): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.7%	[0.4%, 1.3%]	13
Regressions ❌ (secondary)	0.3%	[0.2%, 0.4%]	3
Improvements ✅ (primary)	-0.3%	[-0.4%, -0.2%]	13
Improvements ✅ (secondary)	-1.0%	[-1.9%, -0.3%]	17
All ❌✅ (primary)	0.2%	[-0.4%, 1.3%]	26

Max RSS (memory usage)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	4.7%	[0.1%, 12.7%]	6
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-3.7%	[-7.7%, -0.7%]	3
Improvements ✅ (secondary)	-2.5%	[-2.5%, -2.5%]	1
All ❌✅ (primary)	1.9%	[-7.7%, 12.7%]	9

Cycles

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	1.0%	[0.8%, 1.2%]	2
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	1.0%	[0.8%, 1.2%]	2

Binary size

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.3%	[0.0%, 2.2%]	58
Regressions ❌ (secondary)	0.7%	[0.2%, 2.5%]	5
Improvements ✅ (primary)	-0.1%	[-0.2%, -0.0%]	5
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	0.3%	[-0.2%, 2.2%]	63

Bootstrap: 641.758s -> 640.665s (-0.17%)
Artifact size: 308.80 MiB -> 308.77 MiB (-0.01%)

c410-f3r · 2024-02-20T00:34:59Z

Thanks @Nilstrieb

cynecx · 2024-02-20T03:33:13Z

library/alloc/src/vec/mod.rs

@@ -1925,6 +1929,7 @@ impl<T, A: Allocator> Vec<T, A> {
            let end = self.as_mut_ptr().add(self.len);
            ptr::write(end, value);
            self.len += 1;


@c410-f3r

Just as a note: LLVM's constraint elimination can work better when this is written as:

// rustc emits `add nuw`, so the pass can assume that the instruction does not overflow
self.len = unsafe { self.len.unchecked_add(1) };

This reduces some calls to reserve_for_push but does not eliminate all of them. I haven't really had the time to dig in further to find out...
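A minimal, self-contained sketch of the `unchecked_add` pattern follows. `LenTracker` is a hypothetical stand-in for the length bookkeeping, not the actual `Vec` code, and the capacity check is written out explicitly so the safety argument is visible.

```rust
/// Hypothetical stand-in for the `len`/capacity bookkeeping of a vector.
pub struct LenTracker {
    len: usize,
    cap: usize,
}

impl LenTracker {
    pub fn new(cap: usize) -> Self {
        Self { len: 0, cap }
    }

    pub fn len(&self) -> usize {
        self.len
    }

    /// Increments `len` if there is room; returns `false` when full.
    pub fn bump(&mut self) -> bool {
        if self.len < self.cap {
            // SAFETY: len < cap <= usize::MAX, so len + 1 cannot overflow.
            // Unlike `+= 1`, this lets rustc mark the addition as non-wrapping.
            self.len = unsafe { self.len.unchecked_add(1) };
            true
        } else {
            false
        }
    }
}
```

With a capacity of 2, two calls to `bump` succeed and the third returns `false`, leaving `len` at 2.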

Attempt to optimize Vec::push

b1b8615

rustbot assigned Mark-Simulacrum Feb 19, 2024

rustbot added the S-waiting-on-review (Status: Awaiting review from the assignee but also interested parties.) and T-libs (Relevant to the library team, which will review and decide on the PR/issue.) labels Feb 19, 2024

Tidy

ffd66d9

Noratrieb reviewed Feb 19, 2024

library/alloc/src/vec/mod.rs (outdated, resolved)

Update library/alloc/src/vec/mod.rs

e64d4ba

rustbot added the S-waiting-on-perf (Status: Waiting on a perf run to be completed.) label Feb 19, 2024

rustbot added the perf-regression (Performance regression.) label and removed the S-waiting-on-perf (Status: Waiting on a perf run to be completed.) label Feb 20, 2024

c410-f3r closed this Feb 20, 2024

cynecx reviewed Feb 20, 2024

c410-f3r mentioned this pull request Feb 20, 2024

Vec::reserve(n) followed by n calls to push should only check capacity once #105156

Open
