Remove last rt::init allocation for thread info #123550
r? @Nilstrieb
rustbot has assigned @Nilstrieb.
I just checked and |
d5a081b to d5b8b00
@bors try @rust-timer queue
Remove last rt::init allocation for thread info

Removes the last allocation pre-main by just not storing anything in std::thread::Thread for the main thread.
- The thread name can just be a hard coded literal, as was done in rust-lang#123433.
- The ThreadId is always the `1` value, so `ThreadId::new` now starts at `2` and can fabricate the `1` value when needed.
- Storing Parker in a static that is initialized once at startup. This uses SyncUnsafeCell and MaybeUninit as this is quite performance critical and we don't need synchronization or to store a tag value and possibly leave in a panic.

This currently does not have a regression test to prevent future changes from re-adding allocations pre-main as I'm [having trouble](GnomedDev@6f7be53) implementing it, but if wanted I can draft this PR until that test is ready.
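For readers following along, the ThreadId point can be illustrated with a minimal sketch. This is not the code from the PR: `SketchThreadId`, `NEXT_ID`, and `MAIN` are hypothetical names, and the overflow handling is deliberately simplified. It only shows the idea that the counter starts at 2 so the value 1 can be produced as a constant for the main thread without storing anything.

```rust
use std::num::NonZeroU64;
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical stand-in for `std::thread::ThreadId` (not the real type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct SketchThreadId(NonZeroU64);

/// Next ID to hand out. It starts at 2 because 1 is reserved for the main thread.
static NEXT_ID: AtomicU64 = AtomicU64::new(2);

impl SketchThreadId {
    /// The main thread's ID is always 1, so it can be fabricated as a
    /// constant with no allocation and no access to the counter.
    const MAIN: SketchThreadId = SketchThreadId(NonZeroU64::MIN);

    /// Hands out a fresh ID for a non-main thread.
    /// Simplification: this sketch just panics if the counter wraps to zero.
    fn new() -> SketchThreadId {
        let id = NEXT_ID.fetch_add(1, Ordering::Relaxed);
        SketchThreadId(NonZeroU64::new(id).expect("thread ID counter overflowed"))
    }
}

fn main() {
    let spawned = SketchThreadId::new();
    println!("main = {:?}, first spawned = {:?}", SketchThreadId::MAIN, spawned);
}
```

Because `new` never returns 1, the fabricated main-thread value cannot collide with an ID given to a spawned thread.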
☀️ Try build successful - checks-actions
Finished benchmarking commit (666bbff): comparison URL.
Overall result: ❌ regressions - no action needed
Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.
@bors rollup=never
Instruction count: This is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage): This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles: This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary size: This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Bootstrap: 666.761s -> 666.789s (0.00%)
I don't love the increased complexity and unsafety. If you have some good justification for why this is important that would be great, but I'm inclined to accept it even without that; it certainly feels good to have this property.
The increased complexity is a bit sad, but this is already a complex and unsafe process to initialise the basics for the runtime, so I felt that the increased performance and decreased compile time was worth a small amount of well documented unsafety.
Hmm, looking at the actual perf run, it seems quite negative which is certainly unexpected. How is this commonly debugged, as I don't want to go off vibes?
Run the cachegrind command to see where in the compiler the diff occurs. Though FWIW, I would expect these results to be noise and wouldn't chase them further myself - I'd just treat it as "makes no difference".
so yeah, no real decreased compile time. as for increased performance, I doubt that this will be measurable, maybe
Okay, I don't have a benchmark (I never have a benchmark). Would you like me to rewrite this using OnceLock, just to see if that perf run is also neutral?
778330b to ab8eba1
Sorted the existing review comments, just waiting on a reply to my last comment.
☔ The latest upstream changes (presumably #123913) made this pull request unmergeable. Please resolve the merge conflicts.
ab8eba1 to 2c45b39
Okay, @Nilstrieb, I've spent the last week trying different ways to make this less unsafe and complex, but it doesn't seem possible with the "Parker must be initialized in place" requirement. I cannot initialize a OnceLock or an Option in-place without increasing complexity significantly, so this seems like the least complex (and most performant) way to do this.
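To make the in-place requirement concrete, here is a rough sketch of the pattern being discussed: a static slot of uninitialized memory that is written exactly once at startup through a raw pointer. Everything here is an assumption for illustration. `FakeParker`, `StaticSlot`, `init_main_parker`, and `main_parker` are hypothetical names, and `StaticSlot` is a hand-rolled substitute for the unstable `SyncUnsafeCell` so the sketch builds on stable Rust.

```rust
use std::cell::UnsafeCell;
use std::mem::MaybeUninit;

/// Hypothetical stand-in for the runtime's parker; the real type is internal to std.
struct FakeParker {
    state: u32,
}

impl FakeParker {
    /// Initializes a parker directly in its final memory location.
    ///
    /// # Safety
    /// `slot` must be valid for writes and properly aligned.
    unsafe fn new_in_place(slot: *mut FakeParker) {
        unsafe { slot.write(FakeParker { state: 0 }) }
    }
}

/// Stable-Rust substitute for the unstable `SyncUnsafeCell<MaybeUninit<T>>`.
struct StaticSlot<T>(UnsafeCell<MaybeUninit<T>>);

// SAFETY (assumption of this sketch): the slot is written exactly once,
// before any other thread can observe it, and is only read afterwards.
unsafe impl<T: Sync> Sync for StaticSlot<T> {}

static MAIN_PARKER: StaticSlot<FakeParker> =
    StaticSlot(UnsafeCell::new(MaybeUninit::uninit()));

/// # Safety
/// Must be called exactly once, before any call to `main_parker`.
unsafe fn init_main_parker() {
    // `MaybeUninit<T>` has the same layout as `T`, so the pointer cast is valid.
    unsafe { FakeParker::new_in_place(MAIN_PARKER.0.get().cast::<FakeParker>()) }
}

/// # Safety
/// `init_main_parker` must have completed before this is called.
unsafe fn main_parker() -> &'static FakeParker {
    unsafe { &*MAIN_PARKER.0.get().cast::<FakeParker>() }
}

fn main() {
    // SAFETY: single-threaded startup; init happens once, before the read.
    unsafe { init_main_parker() };
    let parker = unsafe { main_parker() };
    println!("main parker state = {}", parker.state);
}
```

Unlike a `OnceLock` or `Option`, the slot carries no tag and does no synchronization, which is where both the performance argument and the extra unsafety come from: the caller alone is responsible for the init-before-use ordering.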
running the try job to confirm the test failure is still a problem @bors try |
Remove last rt::init allocation for thread info

Removes the last allocation pre-main by just not storing anything in std::thread::Thread for the main thread.
- The thread name can just be a hard coded literal, as was done in rust-lang#123433.
- The ThreadId is always the `1` value, so `ThreadId::new` now starts at `2` and can fabricate the `1` value when needed.
- Storing Parker in a static that is initialized once at startup. This uses SyncUnsafeCell and MaybeUninit as this is quite performance critical and we don't need synchronization or to store a tag value and possibly leave in a panic.

This also adds a UI test to make sure that allocations do not occur before main ever again.

try-job: dist-x86_64-linux
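As a rough idea of how allocations before main can be observed, the sketch below installs a counting global allocator and reads the counter at the top of `main`. This is not the UI test from the PR, whose contents are not shown in this thread. `CountingAlloc`, `ALLOCATIONS`, and `allocations_so_far` are hypothetical names, and the sketch only reports the count rather than asserting on it, since the result depends on the toolchain and target.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical counting allocator: forwards to the system allocator and
/// records how many allocations have been requested so far.
struct CountingAlloc;

static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Number of allocations requested since process start.
fn allocations_so_far() -> usize {
    ALLOCATIONS.load(Ordering::Relaxed)
}

fn main() {
    // Read the counter first, before anything in `main` can allocate.
    let before_main = allocations_so_far();
    println!("allocations observed before main: {before_main}");
}
```

A stricter variant could abort inside `alloc` until `main` sets a flag, which is closer in spirit to a regression test, at the cost of being harder to debug when it fires.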
☀️ Try build successful - checks-actions
...? Uh, that worked? Let's try this for real.
💔 Test failed - checks-actions
must not run the relevant test suite?
Yep, this passed try builds months ago, although this looks like a fixable test failure.
@rustbot author ?
☔ The latest upstream changes (presumably #127924) made this pull request unmergeable. Please resolve the merge conflicts.
🔒 Merge conflict: This pull request and the master branch diverged in a way that cannot be automatically merged. Please rebase on top of the latest master branch, and let the reviewer approve again. You may also read Git Rebasing to Resolve Conflicts by Drew Blessing for a short tutorial. Please avoid the "Resolve conflicts" button on GitHub.
@bors r- This got back into the queue after the resync
@GnomedDev any updates on this?
Co-authored-by: Nilstrieb <48135649+Nilstrieb@users.noreply.github.com>
a8256af to a240dd8
@Dylan-DPC This PR is currently blocked on the test failure: the no-allocation-before-main test added by this PR is very target-specific and seems to be failing due to a lack of […]. I have suggested removing the test before, but from memory we were not comfortable merging this without it, so it is blocked until someone can figure out how to target the test even more specifically to a target with […].
☔ The latest upstream changes (presumably #130534) made this pull request unmergeable. Please resolve the merge conflicts.
Removes the last allocation pre-main by just not storing anything in std::thread::Thread for the main thread.
- The ThreadId is always the `1` value, so `ThreadId::new` now starts at `2` and can fabricate the `1` value when needed.

This also adds a UI test to make sure that allocations do not occur before main ever again.

try-job: dist-x86_64-linux