Don't monomorphize ty::tls closures in rustc_query_impl #106311

Noratrieb · 2022-12-30T22:57:45Z

alternative to #106270; closes #106270. helps with #65031

Best reviewed commit by commit.

r? @ghost

Noratrieb · 2022-12-30T22:58:54Z

@bors try @rust-timer queue

bors · 2022-12-30T22:59:02Z

⌛ Trying commit 473aea57e876dd638c63d3f4b13aa716ef020b74 with merge c082b356fe23b3b62ac02296de4de5d03b136702...

bors · 2022-12-31T01:39:23Z

☀️ Try build successful - checks-actions
Build commit: c082b356fe23b3b62ac02296de4de5d03b136702 (c082b356fe23b3b62ac02296de4de5d03b136702)

rust-timer · 2022-12-31T04:37:10Z

Finished benchmarking commit (c082b356fe23b3b62ac02296de4de5d03b136702): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	0.8%	[0.4%, 1.4%]	3
Improvements ✅ (primary)	-0.5%	[-0.5%, -0.5%]	3
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-0.5%	[-0.5%, -0.5%]	3

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	2.4%	[1.6%, 3.1%]	3
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-	-	0

Cycles

This benchmark run did not return any relevant results for this metric.

Noratrieb · 2022-12-31T13:30:25Z

r? @jyn514

Noratrieb · 2022-12-31T14:14:35Z

The perf regression is probably noise; these benchmarks have been really noisy recently.

jyn514 · 2022-12-31T23:15:06Z

omg

omg omg

jyn514

This seems broadly good to me (7%!!) but I would like to get another set of eyes on the unsafe code. @cjgillot do the changes to rustc_query_impl::plumbing look right to you?

jyn514 · 2022-12-31T23:36:21Z

compiler/rustc_middle/src/ty/context/tls.rs

+/// Replaces the current context with `context`. Returns a drop guard that resets the context when it's dropped.
+///
+/// # Safety
+/// The caller has to ensure that the drop guard is actually run before the lifetime of `context` ends.
+#[inline]
+pub unsafe fn unsafe_set_scoped_context<'a, 'tcx>(
+    context: &'a ImplicitCtxt<'a, 'tcx>,
+) -> impl Sized {


Can we make this API safe by returning impl 'a + Sized, then returning (PhantomData, reset)?

Actually, that might allow us to make unsafe_get_context safe too:

type ResetTlvOnDrop = impl Sized;
pub struct ImplicitCtxtToken<'a, 'tcx>(PhantomData<(&'a (), &'tcx ())>, ResetTlvOnDrop);

pub fn unsafe_set_scoped_context<'a, 'tcx>(
    context: &'a ImplicitCtxt<'a, 'tcx>,
) -> ImplicitCtxtToken<'a, 'tcx> { /* ... */ }

pub fn unsafe_get_context<'a, 'tcx>(
    token: ImplicitCtxtToken<'a, 'tcx>,
) -> &'a ImplicitCtxt<'a, 'tcx> { /* ... */ }

Oh, I guess we hit the original reason we're using thread-locals, though: we can't pass the token through the query :/ That's very unfortunate.

~~you could store the token in tls~~
Yeah, this style of API cannot be made safe given our restrictions. There's a reason why we used closures before :D
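
To make the safety concern in this thread concrete, here is a minimal sketch of a drop-guard API over a thread-local. It is not the PR's code: `TLV`, `Ctxt`, `ResetOnDrop`, `set_scoped_context`, `current_depth`, and `demo` are all hypothetical names chosen for illustration, and the real `ImplicitCtxt` carries far more than a depth counter.

```rust
use std::cell::Cell;

thread_local! {
    // Hypothetical stand-in for the compiler's TLS slot: a type-erased
    // pointer to the currently active context, or null if none is set.
    static TLV: Cell<*const ()> = Cell::new(std::ptr::null());
}

/// Hypothetical context type standing in for `ImplicitCtxt`.
pub struct Ctxt {
    pub depth: usize,
}

/// Restores the previous TLS value when dropped.
pub struct ResetOnDrop(*const ());

impl Drop for ResetOnDrop {
    fn drop(&mut self) {
        TLV.with(|tlv| tlv.set(self.0));
    }
}

/// Installs `context` as the current context and returns a guard that
/// restores the previous one.
///
/// # Safety
/// The returned guard must be dropped before `context` goes out of scope.
pub unsafe fn set_scoped_context(context: &Ctxt) -> ResetOnDrop {
    let old = TLV.with(|tlv| tlv.replace(context as *const Ctxt as *const ()));
    ResetOnDrop(old)
}

/// Reads the depth of the current context, if one is set.
pub fn current_depth() -> Option<usize> {
    let ptr = TLV.with(|tlv| tlv.get()) as *const Ctxt;
    // SAFETY: sound only if every caller of `set_scoped_context`
    // upheld its contract, so the pointer is null or still valid.
    unsafe { ptr.as_ref() }.map(|c| c.depth)
}

/// Observes the context before, during, and after a scoped install.
pub fn demo() -> (Option<usize>, Option<usize>, Option<usize>) {
    let before = current_depth();
    let outer = Ctxt { depth: 1 };
    let during;
    {
        // SAFETY: `_guard` is dropped at the end of this block, before `outer`.
        let _guard = unsafe { set_scoped_context(&outer) };
        during = current_depth();
    }
    let after = current_depth();
    (before, during, after)
}
```

The function has to stay `unsafe` because nothing forces the guard to run: calling `std::mem::forget` on it would leave a dangling pointer in the thread-local once the context is dropped. A lifetime on the guard does not help, since readers reach the context through TLS rather than through the guard. The closure-based API avoids this by owning the whole scope.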

jyn514 · 2023-01-01T00:12:54Z

It might also make sense to have enter_context take &mut dyn FnMut(_) instead; that should keep the perf benefits without needing unsafe. Do you have time to make a separate PR for that so we can compare the two?

Noratrieb · 2023-01-02T10:02:20Z

dyn FnMut won't work because the return type will still need to be generic, causing plenty of monomorphization.
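
The objection above can be illustrated with a small sketch of the `&mut dyn FnMut` approach. This is an assumption about what such an API might look like, not code from the PR: `enter_context_erased` and this simplified `enter_context` are hypothetical, and a plain `usize` stands in for the real context.

```rust
/// Non-generic core: compiled once, no matter how many closure types call it.
#[inline(never)]
fn enter_context_erased(depth: usize, f: &mut dyn FnMut(usize)) {
    // Real code would install the context in TLS here; this sketch
    // only passes an incremented value to the callback.
    f(depth + 1);
}

/// Generic shim: still instantiated once per `(F, R)` pair, because the
/// result has to be carried out of the type-erased call somehow.
pub fn enter_context<F, R>(depth: usize, f: F) -> R
where
    F: FnOnce(usize) -> R,
{
    let mut f = Some(f);
    let mut result: Option<R> = None;
    enter_context_erased(depth, &mut |d: usize| {
        // Move the `FnOnce` out of its slot so it can be called by value.
        let f = f.take().expect("closure called more than once");
        result = Some(f(d));
    });
    result.expect("closure was not called")
}
```

Only the body of `enter_context_erased` is shared. The shim and the inner adapter closure are still generic over `R`, so each distinct return type produces its own copy, which is the remaining monomorphization being pointed out here.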

bors · 2023-03-02T04:20:49Z

⌛ Trying commit 254b7bf with merge a9ebee671300e96fc90fbf5158c16d20a84d61f8...

bors · 2023-03-02T07:13:17Z

☀️ Try build successful - checks-actions
Build commit: a9ebee671300e96fc90fbf5158c16d20a84d61f8 (a9ebee671300e96fc90fbf5158c16d20a84d61f8)

rust-timer · 2023-03-02T08:31:56Z

Finished benchmarking commit (a9ebee671300e96fc90fbf5158c16d20a84d61f8): comparison URL.

Overall result: ❌ regressions - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.3%	[0.2%, 0.3%]	13
Regressions ❌ (secondary)	0.2%	[0.2%, 0.3%]	2
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	0.3%	[0.2%, 0.3%]	13

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-3.1%	[-3.1%, -3.1%]	1
Improvements ✅ (secondary)	-2.1%	[-3.1%, -1.0%]	3
All ❌✅ (primary)	-3.1%	[-3.1%, -3.1%]	1

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	1.4%	[1.4%, 1.4%]	1
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	1.4%	[1.4%, 1.4%]	1

jyn514 · 2023-03-02T08:46:54Z

Looks like inline(always) lost the perf improvement. So #106311 (comment) is probably the best option here; it has a ~0.3% regression on most crates, but speeds up bootstrap times by 7% for query_impl.

Zoxc · 2023-03-02T09:51:48Z

That probably needs to be re-evaluated post #108375.

Did you intend to only measure the inline(always) changes? That doesn't seem to relate much to the earlier changes. There's probably something going wrong with inlining for it to be a regression though.

jyn514 · 2023-03-09T04:23:58Z

> Did you intend to only measure the inline(always) changes? That doesn't seem to relate much to the earlier changes. There's probably something going wrong with inlining for it to be a regression though.

That was @cjgillot's suggestion in #106311 (comment).

> That probably needs to be re-evaluated post #108375.

Ok, I reverted to the version that only makes the closures non-generic.

@bors try

bors · 2023-03-09T04:24:07Z

⌛ Testing commit 5827f79 with merge b9e3d8ff923e24da3ae6734611a88c0ae1531cbb...

bors · 2023-03-09T04:37:02Z

💔 Test failed - checks-actions

jyn514 · 2023-03-09T04:45:51Z

[2535/3025] Linking CXX static library lib\libLLVMDlltoolDriver.a
FAILED: lib/libLLVMDlltoolDriver.a 
cmd.exe /C "cd . && "C:\Program Files\CMake\bin\cmake.exe" -E rm -f lib\libLLVMDlltoolDriver.a && D:\a\rust\rust\mingw64\bin\ar.exe qc lib\libLLVMDlltoolDriver.a  lib/ToolDrivers/llvm-dlltool/CMakeFiles/LLVMDlltoolDriver.dir/DlltoolDriver.cpp.obj && D:\a\rust\rust\mingw64\bin\ranlib.exe lib\libLLVMDlltoolDriver.a && cd ."
D:\a\rust\rust\mingw64\bin\ranlib.exe: could not create temporary file whilst writing archive: no more archived files

???

@bors try

bors · 2023-03-09T04:45:59Z

⌛ Trying commit 5827f79 with merge 4b296010e62183c6912c1a476363b71ed548976a...

bors · 2023-03-09T07:01:32Z

☀️ Try build successful - checks-actions
Build commit: 4b296010e62183c6912c1a476363b71ed548976a (4b296010e62183c6912c1a476363b71ed548976a)

rust-log-analyzer · 2023-03-10T07:16:17Z

The job dist-x86_64-mingw failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)

[2533/3025] Building CXX object lib/XRay/CMakeFiles/LLVMXRay.dir/BlockIndexer.cpp.obj
[2534/3025] Linking CXX static library lib\libLLVMJITLink.a
[2535/3025] Linking CXX static library lib\libLLVMDlltoolDriver.a
FAILED: lib/libLLVMDlltoolDriver.a 
cmd.exe /C "cd . && "C:\Program Files\CMake\bin\cmake.exe" -E rm -f lib\libLLVMDlltoolDriver.a && D:\a\rust\rust\mingw64\bin\ar.exe qc lib\libLLVMDlltoolDriver.a  lib/ToolDrivers/llvm-dlltool/CMakeFiles/LLVMDlltoolDriver.dir/DlltoolDriver.cpp.obj && D:\a\rust\rust\mingw64\bin\ranlib.exe lib\libLLVMDlltoolDriver.a && cd ."
D:\a\rust\rust\mingw64\bin\ranlib.exe: could not create temporary file whilst writing archive: no more archived files
[2536/3025] Linking CXX static library lib\libLLVMDebugInfoGSYM.a
[2537/3025] Linking CXX static library lib\libLLVMRuntimeDyld.a
[2538/3025] Linking CXX static library lib\libLLVMDebugInfoPDB.a
[2539/3025] Linking CXX static library lib\libLLVMObjectYAML.a
[2539/3025] Linking CXX static library lib\libLLVMObjectYAML.a
[2540/3025] Building CXX object lib/XRay/CMakeFiles/LLVMXRay.dir/Trace.cpp.obj
[2541/3025] Building CXX object lib/WindowsManifest/CMakeFiles/LLVMWindowsManifest.dir/WindowsManifestMerger.cpp.obj
[2542/3025] Building CXX object lib/WindowsDriver/CMakeFiles/LLVMWindowsDriver.dir/MSVCPaths.cpp.obj
ninja: build stopped: subcommand failed.
command did not execute successfully, got: exit code: 1


build script failed, must exit now', C:\Users\runneradmin\.cargo\registry\src\index.crates.io-6f17d22bba15001f\cmake-0.1.48\src\lib.rs:975:5
 finished in 241.530 seconds
Build completed unsuccessfully in 0:07:06

jyn514 · 2023-03-11T14:14:55Z

I'm not planning to follow up on this. The query system is changing quite rapidly lately and I don't have time to follow up on these PRs in a reasonable time before they get outdated by other changes.

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Dec 30, 2022

rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Dec 31, 2022

Noratrieb force-pushed the no-tls-mono branch 3 times, most recently from 7154a80 to cf5ed23 Compare December 31, 2022 10:16

Noratrieb marked this pull request as ready for review December 31, 2022 10:17

fbstj reviewed Dec 31, 2022

compiler/rustc_middle/src/ty/context/tls.rs (outdated, resolved)

Noratrieb force-pushed the no-tls-mono branch from c934b1f to 6e28ff1 Compare December 31, 2022 10:49

rustbot assigned jyn514 Dec 31, 2022

jyn514 reviewed Jan 1, 2023

jyn514 assigned cjgillot and jyn514 and unassigned jyn514 Jan 1, 2023

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Mar 2, 2023

bors mentioned this pull request Mar 2, 2023

Run compiler test suite in parallel on Fuchsia #108585

Merged

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Mar 2, 2023

jyn514 added 5 commits March 9, 2023 04:14

try using a non-generic closure in start_query

afa569d

Revert "try using a non-generic closure in start_query"

390fd19

This reverts commit 279063d.

Try using inline(always) to avoid a runtime regression

bbed70c

combine both approaches

6322be3

Revert "Try using inline(always) to avoid a runtime regression"

5827f79

This reverts commit bbed70c.

jyn514 force-pushed the no-tls-mono branch from 254b7bf to 6322be3 Compare March 9, 2023 04:21

bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Mar 9, 2023

jyn514 mentioned this pull request Mar 9, 2023

[experiment] monomorphize fewer items in rustc_query_impl #108643

Closed

Noratrieb closed this Mar 11, 2023
