Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pass fmt::Arguments by reference to PanicInfo and PanicMessage #129491

Merged
merged 1 commit into from
Sep 18, 2024

Conversation

StackOverflowExcept1on
Copy link
Contributor

@StackOverflowExcept1on StackOverflowExcept1on commented Aug 23, 2024

Resolves #129330

For some reason after #115974 and #126732 optimizations applied to panic handler became worse and compiler stopped removing panic locations if they are not used in the panic message. This PR fixes that and maybe we can merge it into beta before rust 1.81 is released.

Note: optimization only works with lto = "fat".

r? libs-api

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Aug 23, 2024
@StackOverflowExcept1on
Copy link
Contributor Author

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@bors
Copy link
Contributor

bors commented Aug 23, 2024

@StackOverflowExcept1on: 🔑 Insufficient privileges: not in try users

@workingjubilee
Copy link
Member

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Aug 24, 2024
bors added a commit to rust-lang-ci/rust that referenced this pull request Aug 24, 2024
Pass `fmt::Arguments` by reference to `PanicInfo` and `PanicMessage`

Resolves rust-lang#129330

For some reason after rust-lang#115974 and rust-lang#126732 optimizations applied to panic handler became worse and compiler stopped removing panic locations if they are not used in the panic message. This PR fixes that and maybe we can merge it into beta before rust 1.81 is released.

Note: optimization only works with `lto = "fat"`.

r? libs-api
@bors
Copy link
Contributor

bors commented Aug 24, 2024

⌛ Trying commit c2fdb34 with merge 4d6f35a...

@bors
Copy link
Contributor

bors commented Aug 24, 2024

☀️ Try build successful - checks-actions
Build commit: 4d6f35a (4d6f35af7e092aa393d8058d662ae8613385e3b9)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (4d6f35a): comparison URL.

Overall result: no relevant changes - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results (secondary 2.3%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
2.3% [2.3%, 2.3%] 1
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) - - 0

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

Results (secondary -0.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-0.0% [-0.0%, -0.0%] 1
All ❌✅ (primary) - - 0

Bootstrap: 751.449s -> 751.215s (-0.03%)
Artifact size: 339.00 MiB -> 338.95 MiB (-0.01%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Aug 24, 2024
@StackOverflowExcept1on
Copy link
Contributor Author

StackOverflowExcept1on commented Aug 24, 2024

I don't know why bot rejected PR. Maybe it needs some special case like crate-type = (c)dylib with lto = "fat" and custom panic handler, but as we can see the binary size of librustc_driver.so has decreased (-99.06 KiB, -0.074%).

@workingjubilee
Copy link
Member

workingjubilee commented Aug 24, 2024

That isn't really strong evidence, it's not uncommon for things that shouldn't really affect size much in a release-compiled binary to nonetheless introduce a variance of about 100KB, like here: #129498

@saethlin
Copy link
Member

I'm shocked that this produces the desired optimization. Usually references get significantly worse optimizations than value because the optimizer has to reason about aliasing, so the fact that adding references makes things optimize better seems extremely fragile.

@StackOverflowExcept1on
Copy link
Contributor Author

StackOverflowExcept1on commented Aug 25, 2024

so the fact that adding references makes things optimize better seems extremely fragile.

@saethlin I tried to use panic_info_message feature with nightly rust, which was released 2 years ago (it uses references) and even on older versions it worked fine. Maybe the reason is size_of::<fmt::Arguments>() = 48, and also it is a Copy type?

Although it's still weird because PanicInfo<'a> has 'a lifetime, and Location::caller() returns &'static Location<'static>, i.e. the compiler will probably infer the type PanicInfo<'static> and still not be able to optimize the code.

@Amanieu
Copy link
Member

Amanieu commented Aug 26, 2024

r? @m-ou-se

@rustbot rustbot assigned m-ou-se and unassigned Amanieu Aug 26, 2024
@tgross35
Copy link
Contributor

Is the impact inflated code size? If so, how much is the change?

If it is not negligible (or even if it is), an rmake test would probably be good to make sure it doesn't regress without us noticing. Especially considering this change is all that is required to get it pruned, seems a bit fragile.

@StackOverflowExcept1on
Copy link
Contributor Author

StackOverflowExcept1on commented Aug 27, 2024

@tgross35 How to run rmake test?

Is the impact inflated code size? If so, how much is the change?

See an example of changing the binary size for librustc_driver.so: #129491 (comment). I think it depends on length of directory where cargo build is run and number of places with panic!().

I also opened the example from #129330 in the decompiler with and without this PR change. In addition to removing location panics, the compiler generates less code.

Before:
void __fastcall rust_begin_unwind(__int128 *a1, __int64 a2, __int64 a3, __int64 a4, __int64 a5, __int64 a6)
{
  __int128 v6; // xmm0
  __int128 v7; // xmm1
  _QWORD v8[2]; // [rsp+0h] [rbp-478h] BYREF
  _OWORD v9[3]; // [rsp+10h] [rbp-468h] BYREF
  _QWORD v10[6]; // [rsp+40h] [rbp-438h] BYREF
  unsigned int v11; // [rsp+74h] [rbp-404h] BYREF
  _BYTE v12[1024]; // [rsp+78h] [rbp-400h] BYREF

  v6 = *a1;
  v7 = a1[1];
  v9[2] = a1[2];
  v9[1] = v7;
  v9[0] = v6;
  v11 = 0;
  v8[0] = v9;
  v10[0] = &off_2CC8;
  v10[1] = 2LL;
  v10[4] = 0LL;
  v8[1] = <core::fmt::Arguments as core::fmt::Display>::fmt;
  v10[2] = v8;
  v10[3] = 1LL;
  (core::fmt::Write::write_fmt)(&v11, v10, a3, v8, a5, a6);
  gr_panic(v12, v11);
  JUMPOUT(0x1C32LL);
}
After:
void __fastcall rust_begin_unwind(__int64 a1)
{
  __int64 v1; // [rsp+8h] [rbp-450h] BYREF
  _QWORD v2[2]; // [rsp+10h] [rbp-448h] BYREF
  _QWORD v3[6]; // [rsp+20h] [rbp-438h] BYREF
  unsigned int v4; // [rsp+54h] [rbp-404h] BYREF
  _BYTE v5[1024]; // [rsp+58h] [rbp-400h] BYREF

  v1 = a1;
  v4 = 0;
  v2[0] = &v1;
  v3[0] = &off_2C10;
  v3[1] = 2LL;
  v3[4] = 0LL;
  v2[1] = <&T as core::fmt::Display>::fmt;
  v3[2] = v2;
  v3[3] = 1LL;
  core::fmt::Write::write_fmt((__int64)&v4, v3);
  gr_panic(v5, v4);
  JUMPOUT(0x1B91LL);
}

Especially considering this change is all that is required to get it pruned, seems a bit fragile.

However, this change has worked for the last few years (only tested nightly rust 2022).

@workingjubilee
Copy link
Member

Please just show me the assembly.

@workingjubilee
Copy link
Member

workingjubilee commented Aug 28, 2024

Pointing to a decompiler's output is actively misleading because it reconstructs the program's information into a language that does not have a 1-1 match to the actual instruction set. So a single line of decompiled C can be 8 assembly instructions, or 8 lines of C can be emitted from a single assembly instruction.

We can at least reconstruct a partially-accurate binary size from assembly.

@StackOverflowExcept1on
Copy link
Contributor Author

StackOverflowExcept1on commented Aug 28, 2024

@workingjubilee I tried to emulate something similar on godbolt and it didn't work. I think it makes sense to just attach 2 ELF files (rename it to .so). Use objdump or some other tools like IDA Pro. I also attach .c pseudocode and .asm.

Source: https://github.com/StackOverflowExcept1on/rust-regression/blob/master/program/src/lib.rs

// c2fdb3435fe
__int64 init()
{
  core::panicking::panic_fmt();
  return rust_begin_unwind();
}

// eef00c8be8d
__int64 init()
{
  __int128 v1; // [rsp+8h] [rbp-30h] BYREF
  __int64 v2; // [rsp+18h] [rbp-20h]
  __int128 v3; // [rsp+20h] [rbp-18h]

  *(_QWORD *)&v1 = &off_2BE0;
  *((_QWORD *)&v1 + 1) = 1LL;
  v2 = 8LL;
  v3 = 0LL;
  core::panicking::panic_fmt();
  return rust_begin_unwind(&v1);
}

From the pseudocode of the init function you can see that the compiler probably uses comptime instead of runtime.

super::intrinsics::const_eval_select((fmt, force_no_backtrace), comptime, runtime);

@joshtriplett joshtriplett added the I-libs-nominated Nominated for discussion during a libs team meeting. label Aug 28, 2024
@Amanieu Amanieu removed the I-libs-nominated Nominated for discussion during a libs team meeting. label Sep 4, 2024
@m-ou-se
Copy link
Member

m-ou-se commented Sep 5, 2024

Do you have any idea why taking message by reference would allow for optimizing away unused locations?

@StackOverflowExcept1on
Copy link
Contributor Author

@m-ou-se I don't know, maybe because of const_eval_select. It might also make sense to compile the same binary with and without lto to see what effect it has.

@StackOverflowExcept1on
Copy link
Contributor Author

StackOverflowExcept1on commented Sep 5, 2024

@m-ou-se my best guess why this happens is that panic_impl is marked as extern "Rust" and the call site of this function is probably less optimized. For example, why did the compiler remove panic locations, but not by default, but only with lto="fat"?. This looks like some optimization issue near FFI boundaries.

When using only references to create PanicInfo, the compiler can probably make more assumptions about the use of panic locations by the panic_impl implementation.

// NOTE This function never crosses the FFI boundary; it's a Rust-to-Rust call
// that gets resolved to the `#[panic_handler]` function.
extern "Rust" {
#[lang = "panic_impl"]
fn panic_impl(pi: &PanicInfo<'_>) -> !;
}
let pi = PanicInfo::new(
fmt,
Location::caller(),
/* can_unwind */ true,
/* force_no_backtrace */ false,
);
// SAFETY: `panic_impl` is defined in safe Rust code and thus is safe to call.
unsafe { panic_impl(&pi) }

@m-ou-se
Copy link
Member

m-ou-se commented Sep 18, 2024

It seems unlikely this will make anything worse, so let's merge this for now.

Once we reduce the size of fmt::Arguments, we should re-evaulate all the references to fmt::Arguments whether they should be by value instead.

@bors r+ rollup=never

@bors
Copy link
Contributor

bors commented Sep 18, 2024

📌 Commit c2fdb34 has been approved by m-ou-se

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 18, 2024
@bors
Copy link
Contributor

bors commented Sep 18, 2024

⌛ Testing commit c2fdb34 with merge aaed38b...

@bors
Copy link
Contributor

bors commented Sep 18, 2024

☀️ Test successful - checks-actions
Approved by: m-ou-se
Pushing aaed38b to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Sep 18, 2024
@bors bors merged commit aaed38b into rust-lang:master Sep 18, 2024
7 checks passed
@rustbot rustbot added this to the 1.83.0 milestone Sep 18, 2024
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (aaed38b): comparison URL.

Overall result: no relevant changes - no action needed

@rustbot label: -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results (primary -1.7%, secondary -3.1%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-1.7% [-1.8%, -1.6%] 2
Improvements ✅
(secondary)
-3.1% [-3.1%, -3.1%] 1
All ❌✅ (primary) -1.7% [-1.8%, -1.6%] 2

Cycles

Results (secondary 5.9%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
5.9% [5.4%, 6.7%] 3
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) - - 0

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 765.544s -> 765.7s (0.02%)
Artifact size: 341.31 MiB -> 341.29 MiB (-0.01%)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Сompiler can't remove panic locations if they are not used in panic handler
10 participants