Remove last rt::init allocation for thread info #123550

Open
wants to merge 4 commits into
base: master
Conversation

GnomedDev
Contributor

@GnomedDev GnomedDev commented Apr 6, 2024

Removes the last allocation pre-main by just not storing anything in std::thread::Thread for the main thread.

  • The thread name can just be a hard-coded literal, as was done in Remove rt::init allocation for thread name #123433.
  • The ThreadId is always the 1 value, so ThreadId::new now starts at 2 and can fabricate the 1 value when needed.
  • The Parker is stored in a static that is initialized once at startup. This uses SyncUnsafeCell and MaybeUninit because this path is quite performance critical: we do not need synchronization, and we avoid storing a tag value whose check could leave a panic path in the generated code.
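The first two bullets can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `SketchThreadId`, `SketchThread`, and `Inner` are hypothetical stand-ins, and the sketch ignores counter overflow. The idea is that the main thread is represented by `None`, so its ID and name are fabricated on demand instead of being read from a heap allocation.

```rust
use std::num::NonZeroU64;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for std's `ThreadId` (not the real type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct SketchThreadId(NonZeroU64);

/// Counter for spawned threads. It starts at 2 because 1 is reserved for main.
static NEXT_ID: AtomicU64 = AtomicU64::new(2);

impl SketchThreadId {
    /// The main thread's ID is a constant, so no state is needed to produce it.
    pub const MAIN: SketchThreadId = SketchThreadId(NonZeroU64::MIN);

    /// Hands out IDs 2, 3, 4, ... (overflow handling is omitted in this sketch).
    pub fn new() -> SketchThreadId {
        let id = NEXT_ID.fetch_add(1, Ordering::Relaxed);
        SketchThreadId(NonZeroU64::new(id).expect("counter starts at 2"))
    }
}

/// Per-thread data that only spawned threads allocate.
struct Inner {
    id: SketchThreadId,
    name: Option<String>,
}

/// Hypothetical thread handle: `None` means "this is the main thread".
pub struct SketchThread {
    inner: Option<Arc<Inner>>,
}

impl SketchThread {
    /// Creating the main thread's handle performs no heap allocation.
    pub fn main() -> SketchThread {
        SketchThread { inner: None }
    }

    /// Spawned threads still get an `Arc` holding their ID and name.
    pub fn spawned(name: Option<String>) -> SketchThread {
        let inner = Inner { id: SketchThreadId::new(), name };
        SketchThread { inner: Some(Arc::new(inner)) }
    }

    pub fn id(&self) -> SketchThreadId {
        match &self.inner {
            Some(inner) => inner.id,
            None => SketchThreadId::MAIN, // fabricated, never stored
        }
    }

    pub fn name(&self) -> Option<&str> {
        match &self.inner {
            Some(inner) => inner.name.as_deref(),
            None => Some("main"), // hard-coded literal
        }
    }
}
```

In this sketch, `SketchThread::main().id()` returns the reserved ID and `SketchThread::main().name()` returns the literal, while every call to `SketchThread::spawned` draws a fresh ID from the counter.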

This also adds a UI test to make sure that allocations do not occur before main ever again.
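One way such a check could be built is with a counting global allocator. The sketch below is an assumption about the general technique, not the contents of the UI test added by this PR; `CountingAllocator`, `ALLOCATIONS`, and `allocations_so_far` are hypothetical names.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Number of allocation requests seen since process start.
static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

/// Forwards to the system allocator and counts every allocation request.
struct CountingAllocator;

unsafe impl GlobalAlloc for CountingAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        // SAFETY: the caller upholds the `GlobalAlloc::alloc` contract.
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        // SAFETY: `ptr` was returned by `System.alloc` with this `layout`.
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static GLOBAL: CountingAllocator = CountingAllocator;

/// Returns how many allocation requests have happened so far.
pub fn allocations_so_far() -> usize {
    ALLOCATIONS.load(Ordering::Relaxed)
}
```

Calling `allocations_so_far()` as the very first statement of `main` reports how many allocations the runtime made before user code ran; a test in this style would assert that the value is zero on the targets it supports.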

try-job: dist-x86_64-linux

@rustbot
Collaborator

rustbot commented Apr 6, 2024

r? @Nilstrieb

rustbot has assigned @Nilstrieb.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Apr 6, 2024
@GnomedDev
Contributor Author

I just checked and Option<Pin<Arc<T>>> does indeed niche, so this doesn't grow the size of std::thread::Thread.
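A quick way to reproduce this kind of check is to compare sizes directly. This is a small sketch with a hypothetical `Payload` type standing in for the thread's inner data; the equality holds with current compilers because `Arc` wraps a non-null pointer, though it is a layout observation rather than something this thread documents as guaranteed.

```rust
use std::mem::size_of;
use std::pin::Pin;
use std::sync::Arc;

/// Hypothetical payload standing in for the thread's inner data.
pub struct Payload {
    _id: u64,
    _name: Option<String>,
}

/// True when wrapping the pinned `Arc` in `Option` costs no extra space,
/// i.e. `None` is encoded in the pointer's unused null value (the niche).
pub fn option_adds_no_space() -> bool {
    size_of::<Option<Pin<Arc<Payload>>>>() == size_of::<Pin<Arc<Payload>>>()
}
```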

@saethlin
Member

saethlin commented Apr 6, 2024

@bors try @rust-timer queue

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Apr 6, 2024
@bors
Contributor

bors commented Apr 6, 2024

⌛ Trying commit d5b8b00 with merge 666bbff...

bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 6, 2024
This currently does not have a regression test to prevent future changes from re-adding allocations pre-main as I'm [having trouble](GnomedDev@6f7be53) implementing it, but if wanted I can draft this PR until that test is ready.
@bors
Contributor

bors commented Apr 6, 2024

☀️ Try build successful - checks-actions
Build commit: 666bbff (666bbff29cc26856cc869d4b7e16f6843b105c4b)

@rust-timer
Collaborator

Finished benchmarking commit (666bbff): comparison URL.

Overall result: ❌ regressions - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.4% | [0.4%, 0.4%] | 1 |
| Regressions ❌ (secondary) | 1.5% | [1.5%, 1.5%] | 1 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.4% | [0.4%, 0.4%] | 1 |

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 4.0% | [2.6%, 5.2%] | 3 |
| Regressions ❌ (secondary) | 4.3% | [2.9%, 6.7%] | 4 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 4.0% | [2.6%, 5.2%] | 3 |

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 1.8% | [1.4%, 2.2%] | 2 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -2.1% | [-2.1%, -2.1%] | 1 |
| All ❌✅ (primary) | - | - | 0 |

Binary size

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.1% | [0.0%, 0.1%] | 23 |
| Regressions ❌ (secondary) | 0.1% | [0.0%, 0.1%] | 35 |
| Improvements ✅ (primary) | -0.1% | [-0.3%, -0.0%] | 2 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.0% | [-0.3%, 0.1%] | 25 |

Bootstrap: 666.761s -> 666.789s (0.00%)
Artifact size: 318.27 MiB -> 318.23 MiB (-0.01%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Apr 6, 2024
Member

@Noratrieb Noratrieb left a comment

I don't love the increased complexity and unsafety. If you have some good justification for why this is important, that would be great, but I'm inclined to accept it even without that; it certainly feels good to have this property.

Review threads (outdated, resolved):
tests/ui/runtime/no-allocation-before-main.rs
tests/rustdoc/demo-allocator-54478.rs
library/std/src/thread/mod.rs
@GnomedDev
Contributor Author

The increased complexity is a bit sad, but initialising the basics of the runtime is already a complex and unsafe process, so I felt that the increased performance and decreased compile time were worth a small amount of well-documented unsafety.

@GnomedDev
Contributor Author

Hmm, looking at the actual perf run, it seems quite negative, which is certainly unexpected. How is this usually debugged? I don't want to go off vibes.

@Noratrieb
Member

Run the cachegrind command to see where in the compiler the diff occurs. Though FWIW, I would expect these results to be noise and wouldn't chase them further myself - I'd just treat it as "makes no difference".

@Noratrieb
Member

Noratrieb commented Apr 6, 2024

> that the increased performance and decreased compile time

So yeah, no real decrease in compile time. As for increased performance, I doubt that this will be measurable, except maybe for fn main() {} (which is a pretty useless program). If you have a benchmark where this helps, that would be great to have.

@GnomedDev
Contributor Author

Okay, I don't have a benchmark (I never have a benchmark). Would you like me to rewrite this using OnceLock, just to see if that perf run is also neutral?

@GnomedDev GnomedDev force-pushed the remove-initial-arc branch 2 times, most recently from 778330b to ab8eba1 Compare April 7, 2024 11:17
@GnomedDev
Contributor Author

Addressed the existing review comments; just waiting on a reply to my last comment.

@bors
Contributor

bors commented Apr 14, 2024

☔ The latest upstream changes (presumably #123913) made this pull request unmergeable. Please resolve the merge conflicts.

@GnomedDev
Contributor Author

Okay, @Nilstrieb, I've spent the last week trying different ways to make this less unsafe and complex, but it doesn't seem possible given the "Parker must be initialized in place" requirement. I cannot initialize a OnceLock or an Option in place without increasing complexity significantly, so this seems like the least complex (and most performant) way to do this.
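To make the trade-off concrete, here is a minimal sketch of in-place initialization of a static through `MaybeUninit`. It is an illustration under assumptions, not the PR's code: `SketchParker`, `StaticSlot`, `init_main_parker`, and `main_parker` are hypothetical names, and `StaticSlot` is a stable-Rust stand-in for the nightly-only `SyncUnsafeCell`.

```rust
use std::cell::UnsafeCell;
use std::mem::MaybeUninit;
use std::ptr::addr_of_mut;

/// Hypothetical stand-in for std's `Parker`, assumed to need in-place setup.
pub struct SketchParker {
    state: u32,
}

impl SketchParker {
    /// Initializes a parker directly inside `slot`, field by field, so a
    /// finished value is never moved.
    ///
    /// # Safety
    /// `slot` must be valid for writes and properly aligned.
    pub unsafe fn new_in_place(slot: *mut SketchParker) {
        // SAFETY: guaranteed by the caller.
        unsafe { addr_of_mut!((*slot).state).write(0) }
    }

    pub fn state(&self) -> u32 {
        self.state
    }
}

/// Stable stand-in for `SyncUnsafeCell<MaybeUninit<T>>`: no tag, no locking.
struct StaticSlot<T>(UnsafeCell<MaybeUninit<T>>);

// SAFETY (sketch only): callers promise one write before any read and no
// concurrent access during that write.
unsafe impl<T: Sync> Sync for StaticSlot<T> {}

static MAIN_PARKER: StaticSlot<SketchParker> =
    StaticSlot(UnsafeCell::new(MaybeUninit::uninit()));

/// # Safety
/// Must run before any call to `main_parker`, while no other thread exists.
pub unsafe fn init_main_parker() {
    // `MaybeUninit<T>` has the same layout as `T`, so the cast is sound.
    let slot = MAIN_PARKER.0.get().cast::<SketchParker>();
    // SAFETY: `slot` points into a static, so it is valid and aligned.
    unsafe { SketchParker::new_in_place(slot) }
}

/// # Safety
/// `init_main_parker` must have completed.
pub unsafe fn main_parker() -> &'static SketchParker {
    // SAFETY: the caller guarantees the slot has been initialized.
    unsafe { &*MAIN_PARKER.0.get().cast::<SketchParker>() }
}
```

Compared with `OnceLock` or `Option`, this form stores no initialized flag and performs no check on access, which is where both the speed and the extra unsafety come from: correctness rests entirely on the documented call order.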

@workingjubilee
Member

running the try job to confirm the test failure is still a problem

@bors try

bors added a commit to rust-lang-ci/rust that referenced this pull request Jul 3, 2024
@bors
Contributor

bors commented Jul 3, 2024

⌛ Trying commit a8256af with merge 0e7868d...

@bors
Contributor

bors commented Jul 3, 2024

☀️ Try build successful - checks-actions
Build commit: 0e7868d (0e7868da82fdc6a39e646133a5eb0279541ee1be)

@workingjubilee
Member

...? Uh, that worked? Let's try this for real.

@bors r=@Nilstrieb

@bors
Contributor

bors commented Jul 5, 2024

📌 Commit a8256af has been approved by Nilstrieb

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jul 5, 2024
@bors
Contributor

bors commented Jul 5, 2024

⌛ Testing commit a8256af with merge 17114c0...

bors added a commit to rust-lang-ci/rust that referenced this pull request Jul 5, 2024

@bors
Contributor

bors commented Jul 5, 2024

💔 Test failed - checks-actions

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Jul 5, 2024
@workingjubilee
Member

workingjubilee commented Jul 5, 2024

So the try job must not run the relevant test suite?

@GnomedDev
Contributor Author

Yep, this passed try builds months ago, although this looks like a fixable test failure.

@Noratrieb
Member

@rustbot author ?

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 14, 2024
@bors
Contributor

bors commented Jul 18, 2024

☔ The latest upstream changes (presumably #127924) made this pull request unmergeable. Please resolve the merge conflicts.

@bors
Contributor

bors commented Jul 27, 2024

🔒 Merge conflict

This pull request and the master branch diverged in a way that cannot be automatically merged. Please rebase on top of the latest master branch, and let the reviewer approve again.

How do I rebase?

Assuming self is your fork and upstream is this repository, you can resolve the conflict following these steps:

  1. git checkout remove-initial-arc (switch to your branch)
  2. git fetch upstream master (retrieve the latest master)
  3. git rebase upstream/master -p (rebase on top of it)
  4. Follow the on-screen instructions to resolve conflicts (check git status if you get lost).
  5. git push self remove-initial-arc --force-with-lease (update this PR)

You may also read Git Rebasing to Resolve Conflicts by Drew Blessing for a short tutorial.

Please avoid the "Resolve conflicts" button on GitHub. It uses git merge instead of git rebase which makes the PR commit history more difficult to read.

Sometimes step 4 will complete without asking for resolution. This is usually due to a difference in how Cargo.lock conflicts are handled during merge and rebase. This is normal, and you should still perform step 5 to update this PR.

Error message
Auto-merging library/std/src/thread/mod.rs
CONFLICT (content): Merge conflict in library/std/src/thread/mod.rs
Auto-merging library/std/src/lib.rs
Automatic merge failed; fix conflicts and then commit the result.

@tgross35
Contributor

tgross35 commented Aug 8, 2024

@bors r-

This got back into the queue after the resync

@Dylan-DPC
Member

@GnomedDev any updates on this?

@GnomedDev
Contributor Author

@Dylan-DPC This PR is currently blocked on the test failure. The no-allocation-before-main test added by this PR is very target-specific and seems to fail on targets that lack __cxa_thread_atexit_impl, where the thread-local setup allocates before main.

I have suggested removing the test before, but as I recall we were not comfortable merging this without it, so the PR is blocked until someone figures out how to restrict the test even more narrowly to targets with __cxa_thread_atexit_impl.
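For context on why thread-local setup involves a registration step at all: a thread-local whose type implements `Drop` needs its destructor registered with the platform on first use, which is the job `__cxa_thread_atexit_impl` does where it exists. The sketch below only demonstrates that such a destructor runs at thread exit; it does not measure allocation, and `NeedsDrop`, `DROPS`, and `drops_after_worker_exit` are hypothetical names.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Counts how many times a `NeedsDrop` destructor has run.
static DROPS: AtomicUsize = AtomicUsize::new(0);

/// A type with a destructor, so its thread-local needs registration.
struct NeedsDrop;

impl Drop for NeedsDrop {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

thread_local! {
    // Lazily initialized; the first access registers the destructor.
    static LOCAL: NeedsDrop = NeedsDrop;
}

/// Touches the thread-local on a worker thread, waits for the worker to
/// exit, and returns how many destructors ran in the meantime.
pub fn drops_after_worker_exit() -> usize {
    let before = DROPS.load(Ordering::SeqCst);
    thread::spawn(|| {
        LOCAL.with(|_| {}); // first access on this thread
    })
    .join()
    .expect("worker thread panicked");
    DROPS.load(Ordering::SeqCst) - before
}
```

How that registration is implemented on targets without `__cxa_thread_atexit_impl` is what the comment above identifies as the source of the pre-main allocation.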

@bors
Contributor

bors commented Sep 19, 2024

☔ The latest upstream changes (presumably #130534) made this pull request unmergeable. Please resolve the merge conflicts.

Labels
S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-libs Relevant to the library team, which will review and decide on the PR/issue.