Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

trans: Avoid using weak_odr linkage for closures. #34793

Closed
michaelwoerister opened this issue Jul 12, 2016 · 7 comments
Closed

trans: Avoid using weak_odr linkage for closures. #34793

michaelwoerister opened this issue Jul 12, 2016 · 7 comments
Labels
A-linkage Area: linking into static, shared libraries and binaries

Comments

@michaelwoerister
Copy link
Member

At the moment, all closures get weak_odr linkage, since that allows for easily avoiding symbol conflicts between copies of the same closure in different crates -- something that can happen when a function containing a closure is inlined into another crate.
One problem though with this approach is that the mingw toolchain does not allow for more than 2^15 weak symbols within one binary. Unfortunately, there are codebases out there run into this limit, e.g. servo/servo#12393 (comment).

So, if we want to keep supporting mingw, we need to find a way to avoid symbol conflicts for closures without using weak linkage.

For monomorphizations of generic functions the same issue exists but in that case it has been easier to solve because we could just instantiate them within each codegen unit with "internal" linkage, thus avoiding any conflicts. For closures this is currently not possible because the cross-crate inlining infrastructure does not allow for inlining a closure on demand.

I see a few possible solutions, each with its own set of disadvantages:

  1. Extend the inlining-infrastructure to better support closures. Then we could handle closures just the same as monomorphizations. The downside is that this would mean an investment into code that will become obsolete once non-MIR-trans is removed.
  2. Always force MIR-trans for closures from extern crates, then we don't need to inline them. The downside is that MIR-trans might not be ready for that yet and that it's kind of a strange and inconsistent behavior.
  3. Make sure that the symbol-hash of a closure contains the crate-id of the instantiating crate, thus avoid cross-crate conflicts. The downside is that this makes symbol naming more complicated again, i.e. one can't just call instance.symbol_name() and get a nicely predictable result anymore.
  4. Store the set of exported closure instances in crate metadata and never locally instantiate closures that already exist somewhere. The downside is that it means more stuff in metadata.
  5. Drop support for mingw :P

cc @rust-lang/compiler

@michaelwoerister michaelwoerister added the A-linkage Area: linking into static, shared libraries and binaries label Jul 12, 2016
@retep998
Copy link
Member

Drop support for mingw :P

I'd imagine there is a non-trivial amount of people that would complain about dropping support for MinGW.

@hanna-kruppe
Copy link
Contributor

To be clear, if MIR trans is turned on across the board then the problem doesn't show up? Then "drop support for mingw" really means

Drop old-trans support for mingw, force mingw users to use MIR trans even though it's not enabled by default yet

Since the only reasons MIR isn't in orbit yet are one or two blockers and some caution about possible unknown problems, this seems like a reasonable temporary inconvenience.

@michaelwoerister
Copy link
Member Author

I'd imagine there is a non-trivial amount of people that would complain about dropping support for MinGW.

Yes, I didn't really mean that to be a serious option.

@michaelwoerister
Copy link
Member Author

Drop old-trans support for mingw, force mingw users to use MIR trans even though it's not enabled by default yet

The problem with this is that MIR-trans does not solve the problem, it just allows for implementing a solution, which would then not be restricted just to MinGW. If we implement this solution (making an internal copy of closures in every codegen unit), we have to use MIR-trans of closures everywhere.

We could do some special casing for MingW, where closures are treated differently on just that platform but that's kind of messy and we would have to support two different linkage schemes in parallel.

I'm not ruling this option out though. It is a good observation.

@michaelwoerister
Copy link
Member Author

OK, so I have a WIP implementation of the "require MIR-trans on MinGW" option here:
https://github.com/michaelwoerister/rust/commits/internal-closures

It might be even less intrusive than expected since it only requires MIR-trans if one is compiling with more than one codegen-unit per crate. Most people probably wouldn't even notice.

@michaelwoerister
Copy link
Member Author

The solution implemented in the branch mentioned above looks pretty good. It passes make check, it allows for compiling Servo with MingW, and most of its ugliness will go away once we don't have to support legacy trans anymore. I'll make a PR and we'll see if people are OK with merging it.

@nagisa
Copy link
Member

nagisa commented Jul 18, 2016

weak_odr linkage seems to also cause inliner to miss inlining opportunities within the same crate, which is the cause of somewhat-steady influx of perf regression reports.

bors added a commit that referenced this issue Jul 18, 2016
…=eddyb

Run base::internalize_symbols() even for single-codegen-unit crates.

The initial linkage-assignment (especially for closures) is a conservative one that makes some symbols more visible than they need to be. While this is not a correctness problem, it does force the LLVM inliner to be more conservative too, which results in poor performance. Once translation is based solely on MIR, it will be easier to also make the initial linkage assignment a better fitting one. Until then `internalize_symbols()` does a good job of preventing most performance regressions.

This should solve the regressions reported in #34891 and maybe also those in #34831.

As a side-effect, this will also solve most of the problematic cases described in #34793. Not reliably so, however. For that, we still need a solution like the one implement in #34830.

cc @rust-lang/compiler
bors added a commit that referenced this issue Aug 1, 2016
…akis

trans: Avoid weak linkage for closures when linking with MinGW.

This PR proposes one possible solution to #34793, the problem that prevents servo/servo#12393 from landing. It applies the same strategy, that we already use for monomorphizations, to closures, that is, instead of emitting symbols with `weak_odr` linkage in order to avoid symbol conflicts, we emit them with `internal` linkage, with the side effect that we have to copy code instead of just linking to it, if more than one codegen unit is involved.
With this PR, the compiler will only apply this strategy for targets where we would actually run into a problem when using `weak_odr` linkage, in other words nothing will change for platforms except for MinGW.

The solution implemented here has one restriction that could be lifted with some more effort, but it does not seem to be worth the trouble since it will go away once we use only MIR-trans: If someone compiles code

1. on MinGW,
2. with more than one codegen unit,
3. *not* using MIR-trans,
4. and runs into a closure inlined from another crate

then the compiler will abort and suggest to compile either with just one codegen unit or `-Zorbit`.

What's nice about this is that I lays a foundation for also doing the same for generics: using weak linkage where possible and thus enabling some more space optimizations that the linker can do.

~~This PR also contains a test case for compiling a program that contains more than 2^15 closures. It's a huge, generated file with almost 100K LOCs. I did not commit the script for generating the file but could do so. Alternatively, maybe someone wants to come up with a way of doing this with macros.~~
The test file is implemented via macros now (thanks @alexcrichton!)

Opinions?

Fixes #34793.

cc @rust-lang/compiler
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-linkage Area: linking into static, shared libraries and binaries
Projects
None yet
Development

No branches or pull requests

4 participants