Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Forbid inlining thread_local!'s __getit function on Windows #101368

Merged
merged 1 commit into from
Nov 23, 2022

Conversation

thomcc
Copy link
Member

@thomcc thomcc commented Sep 3, 2022

Sadly, this will make things slower to avoid UB in an edge case, but it seems hard to avoid... and really whenever I look at this code I can't help but think we're asking for trouble.

It's pretty dodgy for us to leave this as a normal function rather than #[inline(never)], given that if it does get inlined into a dynamically linked component, it's extremely unsafe (you get some other thread local, or if you're lucky, crash). Given that it's pretty rare for people to use dylibs on Windows, the fact that we haven't gotten bug reports about it isn't really that convincing. Ideally we'd come up with some kind of compiler solution (that avoids paying for this cost when static linking, or at least for use within the same crate...), but it's not clear what that looks like.

Oh, and because all this is only needed when we're implementing thread_local! with #[thread_local], this patch adjusts the cfg_attr to be all(windows, target_thread_local) as well.

r? @ChrisDenton

See also #84933, which is about improving the situation.

@rustbot rustbot added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label Sep 3, 2022
@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Sep 3, 2022
@rustbot rustbot added the T-infra Relevant to the infrastructure team, which will review and decide on the PR/issue. label Sep 3, 2022
@nikic
Copy link
Contributor

nikic commented Sep 3, 2022

Just to clarify my understanding of the issue here: #[inline] has two effects. One is that it makes rustc materialize the function when used in other crates. And the second one is to control whether inlining is legal/recommended on the LLVM side. Do I understand correctly that the issue here is purely with the first part? That is, it would be fine to make this always #[inline], so long as it does not get materialized across crate-boundaries? I assume this is also why this has worked fine with just the absence of #[inline] rather than #[inline(never)], as inlining itself is not the issue here.

@nikic
Copy link
Contributor

nikic commented Sep 3, 2022

Or in other words, if we make generates_cgu_internal_copy() return false for function that reference #[thread_local] on Windows, would that fix the issue even if #[inline] is present?

@ChrisDenton
Copy link
Member

Just to be clear, it's not crate boundaries that are the issue. It's dll/exe boundaries. Ideally it'd be great if it could inline across crate boundaries if that doesn't also cause it to jump across dll boundaries.

@ChrisDenton ChrisDenton removed the T-infra Relevant to the infrastructure team, which will review and decide on the PR/issue. label Sep 3, 2022
@thomcc
Copy link
Member Author

thomcc commented Sep 3, 2022

@nikic The issue is that on Windows, if the instantiation of the #[thread_local] ends up in a different DLL from a function that references it, then it's UB (and will crash or worse).

This happens in practice if we have #[inline], so we don't use #[inline] on that function on Windows. The issue is that the compiler is completely allowed to inline functions even if they don't have #[inline] on them.

My understanding is that it's completely possible for this to happen even across a crate_type = "dylib" boundary, and preventing such would be difficult to do in practice in a way that correctly handles transitive inlining and such (for example, even inlining __getit into functions in the same crate could cause problems if those functions are #[inline] or happen to get inlined anyway).

That said, I don't know that anybody has seen it happen. @ChrisDenton indicated that they may have seen it once, but weren't sure. I feel like there are enough configuration options here (including things like -Zshare-generics, which seem like they'd make it more likely that something ends up inlined across a crate boundary) that it feels like the only reason we might not have seen it is that crate_type = "dylib"s are rarely used on Window.

However, if you know a reason that I'm wrong and the inlining can't happen without #[inline], or a better way to prevent this, then I'd be thrilled!

@ChrisDenton
Copy link
Member

I'm going to cc a few people just in case they want to comment on the compiler/dylibs situation with Windows #[thread_local]. Not totally sure who the most relevant people are here so sorry for the ping if it's not you.

cc @bjorn3 @eddyb @michaelwoerister

@thomcc
Copy link
Member Author

thomcc commented Sep 3, 2022

Sounds good, there's no rush here, it's mostly a PR made out of caution, and it would be way better to solve this in a better way (it just seems like if we're going to rely on it not being inlined for correctness we shouldn't mark it as something which is allowed to be inlined).

@eddyb
Copy link
Member

eddyb commented Sep 6, 2022

If the fix is from going between "no #[inline] of any kind" and #[inline(never)], I don't really understand how that happens.

Is it "per-CGU copies" while building a single crate (and was that what @nikic was referencing above)? Is it from LTO?

Normally a non-(type/const)-generic function doesn't get cross-crate codegen without an #[inline] attribute and it sounds like LTO wasn't involved, so it has to be the "per-CGU copy" thing?

If that is the case, it's probably safe to claim that people working on these parts of the stdlib weren't aware that such duplication existed without an #[inline] opt-in, right?

@Subsentient
Copy link

Subsentient commented Nov 10, 2022

I just hit this issue on x86_64 Windows on nightly. Please merge this fix, it would be helpful. Thanks!

@thomcc
Copy link
Member Author

thomcc commented Nov 10, 2022

@eddyb I think that's right, yes.

@ChrisDenton
Copy link
Member

@Subsentient would you be able to give more details? It would be super helpful if you're able. Also did you test if this PR does address the problem you're seeing? No worries if not but it'd be good to confirm.

Btw, I am minded to merge this as an interim fix but I would really love for the underlying problem to be addressed somehow.

@Subsentient
Copy link

Subsentient commented Nov 10, 2022

@Subsentient would you be able to give more details? It would be super helpful if you're able. Also did you test if this PR does address the problem you're seeing? No worries if not but it'd be good to confirm.

Btw, I am minded to merge this as an interim fix but I would really love for the underlying problem to be addressed somehow.

So I'm hitting this in a performance library specific to my company. It is happening in __getit().
It's compiled as both a static rlib for the main executable, and as a cdylib for use in C and C++-based dependency DLLs, as it exposes a C API. Both are frequently loaded at once in the same process.

I can't post source, it's legally not my code, it belongs to the company, but I can post a screenshot of a backtrace.
image

Yes, I would appreciate a merge. It's obviously not a great fix, but if it fixes the functionality, that will be very useful.
Thanks!

@ChrisDenton
Copy link
Member

Thank you! As I say, I do think this needs investigating more but this PR should fix it while that happens.

@bors r+

@bors
Copy link
Contributor

bors commented Nov 10, 2022

📌 Commit 4062925df2350e8c51eb59f5d6dd157d38f79274 has been approved by ChrisDenton

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 10, 2022
@klensy
Copy link
Contributor

klensy commented Nov 10, 2022

Revert last commit?

@ChrisDenton
Copy link
Member

Oh oops.

@bors r-

@bors bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Nov 10, 2022
@thomcc
Copy link
Member Author

thomcc commented Nov 22, 2022

Crap, totally forgot to revert that.

@rustbot
Copy link
Collaborator

rustbot commented Nov 22, 2022

Hey! It looks like you've submitted a new PR for the library teams!

If this PR contains changes to any rust-lang/rust public library APIs then please comment with @rustbot label +T-libs-api -T-libs to tag it appropriately. If this PR contains changes to any unstable APIs please edit the PR description to add a link to the relevant API Change Proposal or create one if you haven't already. If you're unsure where your change falls no worries, just leave it as is and the reviewer will take a look and make a decision to forward on if necessary.

Examples of T-libs-api changes:

  • Stabilizing library features
  • Introducing insta-stable changes such as new implementations of existing stable traits on existing stable types
  • Introducing new or changing existing unstable library APIs (excluding permanently unstable features / features without a tracking issue)
  • Changing public documentation in ways that create new stability guarantees
  • Changing observable runtime behavior of library APIs

@thomcc
Copy link
Member Author

thomcc commented Nov 22, 2022

@bors r=ChrisDenton

@bors
Copy link
Contributor

bors commented Nov 22, 2022

📌 Commit 3099dfd has been approved by ChrisDenton

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Nov 22, 2022
Manishearth added a commit to Manishearth/rust that referenced this pull request Nov 22, 2022
Forbid inlining `thread_local!`'s `__getit` function on Windows

Sadly, this will make things slower to avoid UB in an edge case, but it seems hard to avoid... and really whenever I look at this code I can't help but think we're asking for trouble.

It's pretty dodgy for us to leave this as a normal function rather than `#[inline(never)]`, given that if it *does* get inlined into a dynamically linked component, it's extremely unsafe (you get some other thread local, or if you're lucky, crash). Given that it's pretty rare for people to use dylibs on Windows, the fact that we haven't gotten bug reports about it isn't really that convincing. Ideally we'd come up with some kind of compiler solution (that avoids paying for this cost when static linking, or *at least* for use within the same crate...), but it's not clear what that looks like.

Oh, and because all this is only needed when we're implementing `thread_local!` with `#[thread_local]`, this patch adjusts the `cfg_attr` to be `all(windows, target_thread_local)` as well.

r? `@ChrisDenton`

See also rust-lang#84933, which is about improving the situation.
Manishearth added a commit to Manishearth/rust that referenced this pull request Nov 22, 2022
Forbid inlining `thread_local!`'s `__getit` function on Windows

Sadly, this will make things slower to avoid UB in an edge case, but it seems hard to avoid... and really whenever I look at this code I can't help but think we're asking for trouble.

It's pretty dodgy for us to leave this as a normal function rather than `#[inline(never)]`, given that if it *does* get inlined into a dynamically linked component, it's extremely unsafe (you get some other thread local, or if you're lucky, crash). Given that it's pretty rare for people to use dylibs on Windows, the fact that we haven't gotten bug reports about it isn't really that convincing. Ideally we'd come up with some kind of compiler solution (that avoids paying for this cost when static linking, or *at least* for use within the same crate...), but it's not clear what that looks like.

Oh, and because all this is only needed when we're implementing `thread_local!` with `#[thread_local]`, this patch adjusts the `cfg_attr` to be `all(windows, target_thread_local)` as well.

r? ```@ChrisDenton```

See also rust-lang#84933, which is about improving the situation.
bors added a commit to rust-lang-ci/rust that referenced this pull request Nov 22, 2022
Rollup of 7 pull requests

Successful merges:

 - rust-lang#101368 (Forbid inlining `thread_local!`'s `__getit` function on Windows)
 - rust-lang#102293 (Add powerpc64-ibm-aix as Tier-3 target)
 - rust-lang#104717 (Add failing test for projections used as const generic)
 - rust-lang#104720 (rustdoc: remove no-op CSS `.popover::before / a.test-arrow { display: inline-block }`)
 - rust-lang#104722 (Speed up mpsc_stress test)
 - rust-lang#104724 (Fix `ClosureKind::to_def_id`)
 - rust-lang#104728 (Use `tcx.require_lang_item` instead of unwrapping lang items)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 2f506e6 into rust-lang:master Nov 23, 2022
@rustbot rustbot added this to the 1.67.0 milestone Nov 23, 2022
@thomcc thomcc deleted the wintls-noinline branch November 23, 2022 03:18
@ehuss
Copy link
Contributor

ehuss commented Nov 25, 2022

@thomcc This PR seems to have caused a regression on x86_64-pc-windows-gnu (see #104852), which is affecting Cargo's testsuite.

@thomcc
Copy link
Member Author

thomcc commented Nov 25, 2022

Wow, that's pretty spooky since it definitely shouldn't have an impact like that. That's definitely a compiler bug, but I'll put up a revert anyway...

bors added a commit to rust-lang-ci/rust that referenced this pull request Nov 25, 2022
…isDenton

Revert "Forbid inlining `thread_local!`'s `__getit` function on Windows"

Revert of rust-lang#101368, fixes rust-lang#104852.

I'd rather not do this since that's a soundness fix and this is hitting some compiler bug, but I don't really know an alternative.

r? `@ChrisDenton`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants