std Aware Cargo #2663

Closed
wants to merge 14 commits into from

@jamesmunns jamesmunns commented Mar 16, 2019

Rendered.

Please see Pre-RFC discussion here.

Note: I'm happy to squash/rebase out the multiple commits below. I have included them as they were part of the Pre-RFC discussion.

@jamesmunns
Member Author

CC @alexcrichton

@comex

comex commented Mar 16, 2019

Regarding "stable features"... I think that might be better thought of as bringing Cargo features to std/core, which are pretty much an entirely different concept from #[feature] features, other than the name. In fact, what needs to be implemented is the opposite: a way to mark Cargo features as unstable.

See also: rust-lang/api-guidelines#95 ("Determine how crates should expose 'unstable' APIs")

Anyway, that's just a matter of semantics.


## Should we allow configurable `core` and `std`

If we are to uphold stability guarantees for all configurations of `core` and `std`, this could require testing 2^(n+m) versions of Rust, where `n` is the number of `core` features, and `m` is the number of `std` features. This would have a negative impact on CI times.
Contributor

Our CI times are already quite stretched; I think bors is the bottleneck overall in our development processes. This would also require 2^(n+m) for reasonably many platforms, not just for one.

If core & std (and probably alloc) are to be configurable, it should, for the foreseeable future be exclusively limited to removing things from the standard library, rather than changing algorithms and whatnot.


If Rust opens up to embedded, we can contribute plenty of CI machines.

Contributor

cc @rust-lang/infra ^

Member

We're certainly not going to test all 2^(m+n) combinations on the CI, this is like mindlessly aiming for 100% code coverage. Having one test for all features disabled and one for all features enabled is more than enough.

Contributor

> Having one test for all features disabled and one for all features enabled is more than enough.

In my experience that's rarely the case - features interact with each other in weird ways.

IMO 100% coverage is impossible, but that should not be the goal. The goal should be to get as close to 100% coverage as possible while using a tiny fraction of resources.

I don't think we can achieve that goal by using fixed rules (like testing everything, or testing just A and B). Achieving the goal is going to require investing time into evaluating which features and feature groups make sense to test where.

For example, it might make sense to test more combinations on the x86_64 tier 1 targets only, and it might also make sense to set up a weekly cron job that tests even more combinations, random combinations, etc. But which features should be tested in isolation and which ones can be grouped in the tests with other features is something that we should evaluate and constantly re-evaluate as new features get added on a 1:1 basis.

Member

@gnzlbg Cargo features are supposed to be additive, features interacting within libstd is already a bug IMO.

For sure if it turns out A+C+F has weird behavior compared with no features and A+B+C+D+E+F+G features, we could create a special run-make test to guarantee that special behavior in A+C+F.
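The trade-off discussed in this thread can be sketched in a few lines of Rust. This is only an illustration, not anything the RFC or the Rust CI implements: `full_matrix_size` and `reduced_plan` are hypothetical names, and the feature names are the placeholders `A` to `G` used in the comment above. The first function counts the exhaustive 2^(n+m) matrix; the second builds the smaller plan of "all features off, all features on, plus any combination known to need special attention".

```rust
use std::collections::BTreeSet;

/// A set of enabled feature names (hypothetical placeholder names).
type FeatureSet = BTreeSet<&'static str>;

/// Size of the exhaustive matrix for `n` core features and `m` std features.
/// Every feature can be on or off, so the count is 2^(n+m).
/// Returns `None` when the count does not fit in a `u64`.
fn full_matrix_size(n: u32, m: u32) -> Option<u64> {
    n.checked_add(m).and_then(|bits| 1u64.checked_shl(bits))
}

/// Reduced plan: no features, all features, plus explicitly listed
/// combinations (for example one found to misbehave, such as A+C+F).
fn reduced_plan(all: &[&'static str], extra: &[&[&'static str]]) -> Vec<FeatureSet> {
    let mut plan: Vec<FeatureSet> = Vec::new();
    plan.push(FeatureSet::new()); // everything disabled
    plan.push(all.iter().copied().collect()); // everything enabled
    for combo in extra {
        let set: FeatureSet = combo.iter().copied().collect();
        // Skip combinations that are already part of the plan.
        if !plan.contains(&set) {
            plan.push(set);
        }
    }
    plan
}
```

With seven features the exhaustive matrix has 128 entries per target, while `reduced_plan(&["A", "B", "C", "D", "E", "F", "G"], &[&["A", "C", "F"]])` yields three configurations. Which extra combinations belong in the list is exactly the judgment call debated above.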


Another option in this area is to force the use of profile overrides, as specified by [RFC2282](https://github.com/rust-lang/rfcs/blob/master/text/2282-profile-dependencies.md).

## Should providing a custom `core` or `std` require a nightly compiler?
Contributor

@Centril Centril Mar 17, 2019

Yes, I think this should be considered a requirement. We cannot reasonably have stability and at the same time allow people to do this with a stable compiler.

@phil-opp phil-opp Mar 17, 2019

But isn't this already possible on stable? Just specify a core/std dependency in your Cargo.toml to override the default. Rust even automatically imports the prelude of the custom std.

We use this approach in our stm32f7-discovery crate to augment the core library with the required Future implementations for using async/await on no_std.

Contributor

Ugh... @alexcrichton ^-- Was this intended and who intended it?

(the Future link is dead)

@phil-opp phil-opp Mar 17, 2019

Ah sorry, fixed it. Basically we provide our own implementation for the parts of std::future that currently use thread local storage (including the await macro).

Member

@Nemo157 Nemo157 Mar 17, 2019

Also, part of the discussion on the pre-RFC was that overriding std/core on a stable compiler should be fine because actually implementing std/core will require using unstable features that will require a nightly compiler.

EDIT: and if in the future it’s possible to provide an implementation of std/core without using any unstable features, why should the act of switching your dependencies to it require a nightly compiler?

Contributor

> Otherwise if your crates can finagle some way to compile on stable but still override libstd, then more power to them! (aka you link to the real std and maybe wrap some of its functionality).

Should cargo provide a way to use the non-patched dependency from the patched one?

Contributor

> But isn't this already possible on stable?

@phil-opp Unlike [patch.sysroot], having a crates.io dependency that happens to be named core or std only affects your own crate. And when doing so on stable, you’re very limited in what this dependency can do. In particular it cannot (re)define lang items.

> Forcing unstable compiler just to be able to pick an std that works on embedded

@aep Do you mean that works with async/await without thread-local storage? IMO this is a failure of async/await, not of the core vs. std structure.

@aep aep Jun 7, 2019

@SimonSapin yes. async is in core now, so we need to patch core. Hence the need for a way to have alt stable std/core in cargo

Contributor

@aep Sorry, I haven’t followed all the recent development around async. Why do you need to patch core? Is there a path so that you don’t need to in the future?

In addition to being much preferable IMO, very practically for you, changes to async could happen sooner than stabilization of everything needed to define core.

Member

The only parts of core that the async_await feature uses are core::{future::Future, task::{Poll, Context}}, these are such trivial constructs that I don't see how you could change them while still having the async_await feature work? (I guess replacing Context would be possible since async_await only passes the type around, but doing that would definitely be regarded as operating outside standard procedures).

async_await also uses some functions from std currently, that is just an implementation defect (rust-lang/rust#56974) and doesn't really seem like a good motivation for stabilizing the ability to provide your own std (it would be much better to put the energy towards fixing the defect so that you can just use the builtin async transform).


When compiling for a non-standard target, users may specify their target using a target specification file, rather than a pre-defined target.

> NOTE: The current target specification is described in JSON, and contains some
Contributor

In jamesmunns#1 (comment) you noted that we should be "blind" to this format and what it contains and that stabilizing could be done later. My question is then: how can we reasonably change anything if people start relying on the format on stable? Stability is not just about what you say, it must also be practical.

With my language team hat on (but speaking for myself only) it is important that we be able to feasibly use other backends than LLVM. It should not just be a theoretical possibility to use Cranelift or some other backend, but practically possible in cargo for e.g. debug builds or whatnot.

I would suggest embracing the otherwise incremental approach of this RFC where we start with other things first and let custom target specs be unstable until we are confident that stabilizing won't cause headaches wrt. backends.

Meanwhile we can also do what should be a straightforward switch to TOML.


+1 for the switch to TOML as a first step.

However, I am very hesitant about the idea of stabilising the current custom target format. While there is an RFC describing the format, it feels very much like an unstable implementation detail at the moment, and I've personally put up with many problems with it because it's not meant to be a permanent solution and is not a priority at the moment; I wonder if this reflects other people's thoughts.


On alternative backends: if we do end up stabilising this, the minimum change I think we'd need to make would be to split some options into a backend-options map (like has been done with pre-link-args, for example). For example, you'd be required to write (in the current JSON format):

"backend-options": {
    "llvm": {
        "data-layout": "e-m:e-i64:64-f80:128-n8:16:32:64-S128",
        "target": "x86_64-unknown-none",
    }

    {{other backends could then be added with backwards compatibility here}}
}

You could then only build a target with a specific backend if it has a corresponding entry in the map.
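The rule in the last sentence ("only build with a backend that has an entry in the map") can be sketched as a lookup. This is a minimal illustration and not how rustc parses target specs: `TargetSpec`, `options_for`, and `example_spec` are hypothetical names, and the spec is built in memory instead of being parsed from JSON, so that the example needs no external crates.

```rust
use std::collections::HashMap;

/// Options for a single backend, e.g. "data-layout" -> "e-m:e-i64:...".
type BackendOptions = HashMap<String, String>;

/// Hypothetical in-memory view of a target spec that uses the proposed
/// `backend-options` map. Only that map is modelled here.
struct TargetSpec {
    backend_options: HashMap<String, BackendOptions>,
}

impl TargetSpec {
    /// Returns the options for `backend`, or an error if the spec has no
    /// entry for it, in which case the target cannot be built with it.
    fn options_for(&self, backend: &str) -> Result<&BackendOptions, String> {
        self.backend_options.get(backend).ok_or_else(|| {
            format!("target spec has no `backend-options` entry for `{}`", backend)
        })
    }
}

/// Builds the spec from the snippet above: a single `llvm` entry.
fn example_spec() -> TargetSpec {
    let mut llvm = BackendOptions::new();
    llvm.insert(
        "data-layout".to_string(),
        "e-m:e-i64:64-f80:128-n8:16:32:64-S128".to_string(),
    );
    llvm.insert("target".to_string(), "x86_64-unknown-none".to_string());

    let mut backend_options = HashMap::new();
    backend_options.insert("llvm".to_string(), llvm);
    TargetSpec { backend_options }
}
```

Here `example_spec().options_for("llvm")` succeeds, while asking for a backend without an entry (say `"cranelift"`) returns an error. Adding a new backend later only adds a key to the map, which is what makes the layout backwards compatible.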

Contributor

@Centril being blind to something is just parametricity :). Only rustc cares about the format, cargo need not care about the format at all. What it does need to do however is be able to tell if two machine configs are equal, since Cargo controls caching. So we need a:

abstract type MachineConfig: Eq;

Cargo can forward the contents as raw bytes to rustc (as a CLI arg, or temp file Cargo creates to avoid races, or many other things). As to equality, Cargo can take the hash of the config and use that as a cache key. Sure, this is never complete (concrete formats usually have a more flexible notion of equality), but it is always sound. That's just fine for now.
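A rough Rust rendering of that abstract type, under the assumption that Cargo keeps the spec as opaque bytes: `MachineConfig`, `from_bytes`, and `cache_key` are hypothetical names, and `DefaultHasher` stands in for whatever stable hash a real implementation would pick (its output is not guaranteed to stay the same across Rust releases).

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// An opaque machine configuration: Cargo never interprets the contents,
/// it only compares them and forwards them to rustc.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct MachineConfig {
    raw: Vec<u8>,
}

impl MachineConfig {
    /// Wraps the raw bytes of a target spec without parsing them.
    fn from_bytes(raw: &[u8]) -> Self {
        MachineConfig { raw: raw.to_vec() }
    }

    /// Cache key derived from the raw bytes. Byte-identical configs always
    /// get the same key, so a cache hit never mixes different configs up
    /// (apart from hash collisions).
    fn cache_key(&self) -> u64 {
        let mut hasher = DefaultHasher::new();
        self.raw.hash(&mut hasher);
        hasher.finish()
    }
}
```

Two specs that differ only in whitespace, such as `{"arch":"x86_64"}` and `{ "arch": "x86_64" }`, compare as unequal here even though rustc would treat them the same. That is the "never complete, but always sound" property: the cost is an occasional unnecessary rebuild, never a wrong cache hit.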

@Centril Centril added T-lang Relevant to the language team, which will review and decide on the RFC. T-compiler Relevant to the compiler team, which will review and decide on the RFC. T-cargo Relevant to the Cargo team, which will review and decide on the RFC. labels Mar 17, 2019
@Centril
Contributor

Centril commented Mar 17, 2019

This one was a bit tricky to assign teams to... but:

  • T-cargo -- the cargo aspect of this.
  • T-compiler -- re. support in the compiler itself and target specs.
  • T-lang -- re. the custom target stuff since that has implications wrt. what backends are feasibly possible and not and re. stability... I've added this team in an interim capacity; may change later.

(that's a lot of teams but I expect one team to take the lead and others to review... it may also change...)

Contributor

@ehuss ehuss left a comment

(Implementation detail) I suspect there will need to be special considerations for proc macros and build scripts. My guess is that proc macros will be sensitive to having a different libstd, and may need to only be built with the libstd binaries from the rustc sysroot. This might be quite expensive for some projects.

text/0000-std-aware-cargo.md

Currently, `compiler-builtins` contains components implemented in the C programming language. While these dependencies have been highly optimized, the use of them would require the builder of the root crate to also have a working compilation environment for compilation in C.

This RFC proposes instead to use the [pure rust implementation] when compiling for a custom target, removing the need for a C compiler.
Contributor

Sorry if this is a dumb question, would backtrace also require building a C library?

Member Author

Definitely not a dumb question, I hadn't thought of this. @alexcrichton is there a pure Rust version of libbacktrace? Or are there any other C-dependencies you are aware of that would also need to be listed here?

Member

It's correct yeah that libbacktrace requires a C compiler right now, and while all the pieces exist in basic forms (e.g. gimli, addr2line examples for gimli, etc) in only-Rust they haven't been integrated in such a way yet to get pulled into the standard library. (the issue @kennytm pointed out is tracking that)


## Should profile changes always prompt a rebuild of `core`/`std`?

For example, if a user sets their debug build to use `opt-level = 'z'`, should this rebuild `core`/`std` to use that opt level? Or should an additional flag, such as `apply-to-sysroot` be required to opt-in to this behavior, unless otherwise needed?
Contributor

I'm a little confused on this point. Would it rebuild core/std even if they are not listed in [dependencies] or [patch]? I fear that always rebuilding when using non-default profile settings would be too much disruption. It would also be a little confusing, because there are multiple profiles, none of which match the settings used in the precompiled libraries. Perhaps core/std should only be rebuilt when they are explicitly listed in Cargo.toml?

Another potential issue is that the features to enable for libstd currently are driven by bootstrap's config.toml, so it may not be obvious to the user that they need to enable things like "backtrace" to have feature parity with the defaults (which change per platform). How do you switch to a custom-built std that retains feature parity with the default distribution per platform?

Member Author

That is exactly the open question here. On one hand: if the user states they want "opt-level = 's'", and we can now give them that for libstd too, it would make sense to keep the total program size down.

On the other hand, this could surprise some users, as it could drastically increase a clean build time by rebuilding core and standard for them.

Good call regarding config.toml, I was not aware of how this mechanism worked. In xargo, you can set some (or all?) of these flags using a feature = [ ... ] syntax. I would guess we would need to expose these features as "Cargo Features" rather than config.toml features so the users may configure them, with default-features matching what the CI builds of libstd and libcore currently are.

Contributor

@ehuss ehuss Mar 18, 2019

I think they are "Cargo Features", config.toml just drives the defaults. This is done here.

One thing I'm aware of that is not driven via features is sanitizer support. That is done here. It looks like sanitizers are used on the linux builds, but I don't really know anything about them. Looks like it requires llvm?

Maybe the defaults could be captured somewhere when the distribution is made, and Cargo could read those (maybe in the "src" component? maybe rewrite the Cargo.toml?)? EDIT: 🤔 Except that won't work for non-host targets.

Contributor

It's a general question whether any dependency needs to be rebuilt for these non-abi-affecting switches, so as usual I hope for a Cargo solution that doesn't special-case core and std.

shepmaster and others added 3 commits March 18, 2019 15:09
Co-Authored-By: jamesmunns <james@onevariable.com>
@jethrogb
Contributor

Is it necessary to introduce the concept of "stable features" in this RFC? Can this be in a separate RFC from the building/source specification?

I brought this up in the pre-RFC: I think the RFC should say something about the current [patch.crates-io] section in Rust's Cargo.toml.

What about the test crate?

@Ericson2314
Contributor

@ehuss

> (Implementation detail) I suspect there will need to be special considerations for proc macros and build scripts. My guess is that proc macros will be sensitive to having a different libstd, and may need to only be built with the libstd binaries from the rustc sysroot. This might be quite expensive for some projects.

That's just a cross compiling concern. I'm strongly of the opinion that one should always be cross compiling, and native compiling is just cross compiling where the platform you are building for and the platform you're building on happen to be the same. Under this philosophy, it should be possible to do [patch.host.crates-io] and only affect regular dependencies, not build.rs or proc macro dependencies.

(I'm strongly of this opinion having previously refactored two existing build tools to adopt this philosophy, even though they initially didn't at all. It's not too late for Cargo either.)

In today's Rust environment, `core` and `std` are shipped as precompiled objects. This was done for a number of reasons, including faster compile times, and a more consistent experience for users of these dependencies. This design has served the bulk of users fairly well. However there are a number of less common uses of Rust, that are not well served by this approach. Examples include:

* Supporting new/arbitrary targets, such as those defined by a custom target (".json") file
* Modifying `core` or `std` through use of feature flags
Contributor

Cargo feature flags? or compiler feature flags (-C target-cpu, -C target-feature, --cfg foo, etc. ?) Or both?

It is necessary to use unstable features to build `core`. To allow users of a stable compiler to build `core`, we would set the `RUSTC_BOOTSTRAP` environment variable **ONLY** for the compilation of `core`.

This should be considered sound, as stable users may not change the source used to build `core`, or the features used to build `core`.

Contributor

Why wouldn't this also be sound for crates on crates.io (or for alloc and libstd) ?

Contributor

Also maybe mention that even when compiled with RUSTC_BOOTSTRAP unstable features from core are not available to stable users.
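The scoping described in the RFC text above (set `RUSTC_BOOTSTRAP` for `core` only, never for user crates) can be sketched as follows. This is an illustration under assumptions, not Cargo's implementation: `rustc_invocation`, `sets_bootstrap`, and the `BOOTSTRAP_CRATES` list are hypothetical, and the value `1` is assumed as the way the variable is normally set. The command is only constructed, never run.

```rust
use std::process::Command;

/// Hypothetical allow-list of crates that are compiled with
/// `RUSTC_BOOTSTRAP` set. Per the RFC text this starts out as just `core`.
const BOOTSTRAP_CRATES: &[&str] = &["core"];

/// Builds (but does not run) a rustc invocation for `crate_name`, setting
/// `RUSTC_BOOTSTRAP` only when the crate is on the allow-list.
fn rustc_invocation(crate_name: &str) -> Command {
    let mut cmd = Command::new("rustc");
    cmd.arg("--crate-name").arg(crate_name);
    if BOOTSTRAP_CRATES.contains(&crate_name) {
        // Assumption: "1" is the value used to enable the escape hatch.
        cmd.env("RUSTC_BOOTSTRAP", "1");
    }
    cmd
}

/// Reports whether the command explicitly sets `RUSTC_BOOTSTRAP`.
fn sets_bootstrap(cmd: &Command) -> bool {
    cmd.get_envs()
        .any(|(key, value)| key == "RUSTC_BOOTSTRAP" && value.is_some())
}
```

Because the variable is attached to the individual `Command` and not to Cargo's own environment, a user crate built in the same session (for example `rustc_invocation("my_app")`) does not inherit it, which is the property the soundness argument relies on.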

In general, the same restrictions for building `core` will apply to building `std`. These include:

* Users of the stable compiler must use the source used to build the current Rust compiler
* Only compile time features considered `stable` may be used outside of nightly. Initially the list of `stable` features would be empty, and stabilizing these features would require a PR/RFC to `libstd`.
Contributor

We would have to support these stable features forever, so each feature should go through the RFC process. A PR + mini FCP is not an option for this in my opinion.

Member Author

Sorry, that was meant to be RFC + PR, rather than either.

By using stable feature flags for `std`, we could say that `std` as a crate with `default-features = false` would essentially be `no_core`, or with `features = ["core"]`, we would be the same as `no_std`.

This abstraction may not map to the actual implementation of `libcore` or `libstd`, but instead be an abstraction layer for the end developer.

Contributor

Note that we would still need to allow using [dependency.core] forever and somehow map that to a unified version of the library.

Or in other words: removing the concept of core and std (e.g. into a unified std that uses the portability lint) would be a breaking change if this RFC were merged as is.

I don't know how hard it would be to have a unified library that's provided as a "split" one simultaneously (or in an edition-dependent way) for backwards compatibility.

Member

It seems like this could easily be dealt with by making core (and alloc) a facade over std with a limited set of default active features. As long as the current core -> std dependency order is not observable somehow.

@alexcrichton
Member

One of the things I'm worried about is the trigger that causes Cargo to build libstd/libcore/etc. The RFC currently says that only the root crate can do it and it happens when a custom target is used, feature flags are modified, profile settings change, or [patch.sysroot] is present. Some questions I'd have are:

  • What happens though when crates on crates.io change feature flags of core/std?
  • How does Cargo know what to build? Does it always try to build everything up to std? When does it only build core (and maybe alloc)?
  • I agree with @ehuss that changing any profile setting causing a build of std/core may be a bit heavy handed because there's lots of tweaking of profiles right now that probably don't want to build std/core (although there's surely some that do want a recompile)

I'm a little uneasy about how we're going to handle compiler_builtins, but that crate is the embodiment of "just keep everything unstable and somehow get everything to work", so I think it's fine to leave basically all details related to compiler_builtins to the implementation. It may cause issues but they're likely not too hard to overcome.

@Ericson2314
Contributor

> How does Cargo know what to build? Does it always try to build everything up to std? When does it only build core (and maybe alloc)?

Hmm there should be some language in the RFC saying by default crates are assumed to depend on std and core, but if you explicitly list one of those as a dependency, you get no other implicit dependencies. That way, explicitly depending on core (or core + alloc) means no implicit dependency on std.

Then Cargo simply builds the dependencies which are needed per the normal rules.
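The rule proposed in this comment can be written down as a small function. This is a sketch of the suggestion only, not existing Cargo behaviour: `sysroot_dependencies` is a hypothetical name, and treating `alloc` as one of the crates that switches off the implicit defaults is an assumption based on the "core + alloc" remark.

```rust
/// Given the dependency names a crate lists explicitly, returns the standard
/// library crates it depends on under the proposed rule:
/// - none of core/alloc/std listed: implicit dependency on core and std;
/// - at least one listed: exactly the listed ones, nothing implicit.
fn sysroot_dependencies<'a>(explicit: &[&'a str]) -> Vec<&'a str> {
    // Assumption: these three names are the ones that disable the defaults.
    const SYSROOT: [&str; 3] = ["core", "alloc", "std"];

    let listed: Vec<&str> = explicit
        .iter()
        .copied()
        .filter(|name| SYSROOT.contains(name))
        .collect();

    if listed.is_empty() {
        vec!["core", "std"]
    } else {
        listed
    }
}
```

A crate that lists only `serde` gets `core` and `std`; a crate that lists `core` and `serde` gets `core` alone, so nothing pulls in `std` and Cargo would only need to build `core` for it.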

@Ericson2314
Contributor

> What happens though when crates on crates.io change feature flags of core/std?

If crates there are allowed to [patch.*] today, then I have no problem allowing them to patch core and std. Likewise for non-root crates regardless of source/registry. Making this stuff unstable Cargo features is an independent concern.

@newpavlov
Contributor

I personally prefer several user facing crates (core, alloc, collections, etc.) over a single std crate with a lot of features. And I think it will be nice to eventually stop shipping pre-compiled std and core, as it will be quite useful for issues like this one: rust-lang/rust#51713

Also I think we should keep in mind PAL proposal: https://internals.rust-lang.org/t/4301

cuviper added a commit to cuviper/rust that referenced this pull request Mar 30, 2019
miri needs to build std with xargo, which doesn't allow stable/beta:
<japaric/xargo#204 (comment)>

Therefore, at this time there's no point in making miri available on any
but the nightly channel.  If we get a stable way to build `std`, like
[RFC 2663], then we can re-evaluate whether to start including miri,
perhaps still as `miri-preview`.

[RFC 2663]: rust-lang/rfcs#2663
Centril added a commit to Centril/rust that referenced this pull request Mar 30, 2019
manifest: only include miri on the nightly channel

@mark-i-m
Member

What is the status of this?

@jamesmunns
Member Author

@mark-i-m this is probably due an update to the RFC to take the feedback from the comments, and summarize the open discussion points. Unfortunately I will not have time to address this until early May, but I do plan to continue pushing for this then.

As far as I know, there have not been any critical issues raised, though there are a number of open points that need to be addressed or discussed further.

@Centril
Contributor

Centril commented Apr 19, 2019

I have been lax in responding to some points made in this RFC; I hope to change that soon.

@nikomatsakis
Contributor

We discussed this RFC in the @rust-lang/lang meeting today! In general, it seemed like there weren't a lot of lang-team specific concerns at this stage, though going forward there may be more. You can watch the video and find some notes here.

On a personal note, I wanted to raise a concern about the use of cargo features: the default-features = false design of Cargo makes it a semver-breaking change to take existing "non-featured" items and put them behind a feature flag. This is a problem for all crates, but it's a disaster for libstd, which can't issue a 2.0 crate. I'd like to see it listed as an unresolved question (or blocking concern) to find some other design before we stabilize anything related to "std features". (One thought I had is that you might have some way to pare back to the "minimal features" as of a particular Rust version -- e.g., default-features = "1.36".)
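One possible reading of the versioned `default-features = "1.36"` idea, sketched in Rust. Everything here is an assumption made for illustration: the `StdFeature` type, the `forced_on_features` function, the feature names, and the release numbers are invented, and the comment does not spell out these semantics. The reading is that a crate opting into "minimal features as of 1.36" keeps every feature that was carved out of the ungated library after 1.36, because code written against 1.36 could use those items unconditionally.

```rust
/// A hypothetical std feature and the (major, minor) release in which its
/// items were first moved behind a feature flag.
struct StdFeature {
    name: &'static str,
    gated_since: (u32, u32),
}

/// Features that stay enabled for a crate asking for the minimal feature set
/// as of `baseline`: everything gated after that release, since those items
/// were unconditionally available when the crate was written.
fn forced_on_features(features: &[StdFeature], baseline: (u32, u32)) -> Vec<&'static str> {
    features
        .iter()
        .filter(|feature| feature.gated_since > baseline)
        .map(|feature| feature.name)
        .collect()
}

/// Invented example data; neither the names nor the versions are real.
fn example_features() -> Vec<StdFeature> {
    vec![
        StdFeature { name: "backtrace", gated_since: (1, 30) },
        StdFeature { name: "threads", gated_since: (1, 40) },
    ]
}
```

Under this reading, a baseline of `(1, 36)` leaves `threads` enabled and lets `backtrace` be turned off, so moving `threads` behind a flag in the invented 1.40 release would not break a crate that pinned its minimal set to 1.36.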

@ids1024

ids1024 commented Jun 7, 2019

> the default-features = false design of Cargo makes it a semver-breaking change to take existing "non-featured" items and put them behind a feature flag.

Good point. One way to address this, though it might generally be considered an anti-pattern, would be to not have any on-by-default features, and instead have features to disable things.

Otherwise, I suppose it would be necessary to carefully choose what core features should not be behind feature flags. And then when (perhaps inevitably) a need arises for a way to disable one of those, that would end up with a feature to disable it rather than enable it (in contrast to some other features).

I suppose the other extreme would be to have default-features = false disable the entire standard library, with features to enable different parts. It would then be possible to add finer grained features as needed (and features could be deprecated, and hidden from documentation). This is rather complicated and probably a crazy idea, but in some ways it seems the most robust.

And then if all the crates (not only the top level one) might be specifying different std dependencies...

Luckily, the main point of this RFC can be implemented and stabilized without the "stable features", which could be added later (by initially making cargo reject any attempt to set features for std). Though this loses some, but not all, of the benefits of this feature.

Edit: Another conservative approach: initially ban default-features. Then additive features could be defined and made stable, without first solving this issue. Though this is only useful if there were some additive feature to stabilize (I'm mainly aware of on-by-default std features).


This RFC proposes the following concrete changes, which may or may not be implemented in this order, and may be done incrementally. The details and caveats around these stages are discussed in the [Reference Level Explanation][reference-level-explanation].

In this document, we use the term "root crate" to refer to the Rust project being built directly by Cargo. This crate contains the Cargo.toml used to guide the modifications described below. This would typically be a crate containing a binary application, or a standalone item, such as an `rlib`.
Contributor

I feel that “root crate” is not the right concept. In a virtual workspace, there may not be a root crate at all. As far as I can tell what’s important for the purpose of this RFC is having a root Cargo.toml file in which to specify some configuration, whether or not there’s a corresponding compiler artifact.

So “root manifest” may be better here. It refers to the Cargo.toml file pointed to by --manifest-path, or in the current directory.

Yes, frankly I wish we always had a virtual workspace root. That would make the distinction between a "workspace query" and a "crate graph solution" a lot clearer.


1. Allow developers of root crates to recompile `core` (and `compiler-builtins`) when their desired target does not match one available as a `rustup target add` target, without the usage of a nightly compiler. This version of `core` would be built from the same source files used to build the current version of `rustc`/`cargo`.

I don’t see a reason to limit this to libcore. In other words, I think we should do part 3 from the start and support all standard library crates. Currently this is core, alloc, std, proc_macro, and test. Maybe also their dependencies?

In the same vein, please use “standard library crates” throughout the RFC wherever it currently says “core”.


Users of a stable compiler would not be able to customize `core` outside of these profile settings.

For users of a nightly compiler, compile time features of `core` may be specified using the same syntax used for other crate dependencies. These specified features may include unstable features.

The syntax should be similar to that for crates.io dependencies. But I feel rather strongly that it should not be exactly the same. Currently the example below refers to https://crates.io/crates/core. The fact that the crates.io server rejects uploads to that name is a separate concern, unrelated to the meaning of a given bit of Cargo.toml syntax.

How does Cargo know the difference between a sysroot dependency and a crates.io dependency? I think we should not hard-code a list of known standard library crate names.

Cargo already has a concept of “source” for all packages, with the default being crates.io. Some keys in the TOML table/dictionary for a dependency can change the source: path, git, and registry. I think sysroot = true (name to be bikeshedded) should be another one of these, and necessary to refer to standard library crates.

I believe this is compatible (aligns, in the underlying concepts in Cargo) with the [patch.sysroot] syntax proposed later in this RFC.
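
To make the "source" idea above concrete, here is a minimal Rust sketch of package identity as a (name, source) pair. It is a toy model, not Cargo's actual data structures: the `Source` enum, the `Sysroot` variant, and the helper names are all assumptions made for illustration.

```rust
/// Where a package comes from. `Sysroot` is the hypothetical extra source
/// suggested above; the other variants loosely mirror existing Cargo.toml keys.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum Source {
    CratesIo,
    Path(String),
    Git(String),
    Registry(String),
    Sysroot, // hypothetical: would be selected by something like `sysroot = true`
}

/// In this model a package is identified by its name *and* its source.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct PackageId {
    name: String,
    source: Source,
}

/// Convenience constructor for the sketch.
fn package(name: &str, source: Source) -> PackageId {
    PackageId { name: name.to_string(), source }
}

/// Two dependency entries refer to the same package only if both the
/// name and the source match.
fn same_package(a: &PackageId, b: &PackageId) -> bool {
    a == b
}
```

Under this model, `package("core", Source::Sysroot)` and `package("core", Source::CratesIo)` are different packages, so no hard-coded list of standard library crate names is needed to tell them apart.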

@SimonSapin the problem is that which libraries are special-cased to the implementation is necessarily an implementation specific detail. We want to move crates to crates.io when they become 100% stable code and not have things break. I think the compiler should provide a "workspace override" which does the [patch.sysroot]-ing and/or we start to distinguish the "default source" from crates.io (the default source may be a composition of sources).

@Ericson2314 I’m sorry. I feel like I should respond to your message here, but I have a very hard time parsing it :/ It seems based on multiple unstated assumptions. For example:

> an implementation specific detail

Are you imagining a world where there are multiple implementations of Rust where they each have their take on the entire toolchain? Today mrustc exists, but doesn’t have its own Cargo.

My objection was to hard-coding a list of names known to be crates of the standard library in Cargo, because Cargo and the standard library are developed separately.

> We want to move crates to crates.io when they become 100% stable code and not have things break.

I know from previous interactions with you that you have this idea of the standard library somehow having its source of truth on crates.io. You take it as a given here but I still maintain that this idea is fundamentally contradictory, because of the difference in versioning between these two worlds.

A library gets to pick what other libraries from crates.io it depends on, and it gets to pick their version numbers. A program can even end up with multiple versions of the “same” library in its dependency graph.

A library does not however get to pick what version of the compiler is being used. (Only which versions it tries to be compatible with.) The standard library is by definition what ships with a given compiler. It doesn’t have its own version number, it shares the compiler’s version number.

I think that one can make a valid argument that less functionality should be the responsibility of the standard library. For example, std::sync::mpsc could be deprecated without a replacement in std and https://crates.io/crates/crossbeam-channel be recommended instead. But being shipped with the compiler is what “standard library” means. “Moving std to crates.io” doesn’t make any sense in my opinion.

@comex comex Jun 9, 2019

That's how the standard library works today, but there's no rule that it must always be that way. It's true that some parts of it, mostly in core, are tightly bound to the compiler implementation, such as wrappers around intrinsics and definitions of language items. But significant chunks are not, including most of std itself (all the wrappers around OS functionality) as well as significant parts of core (e.g. unicode, flt2dec, iterator adapters). The current implementations of those APIs may or may not use unstable features, but the APIs don't inherently require it. The deprecation route you mentioned could work in theory, but I don't think users would appreciate a large percentage of the standard library being deprecated.

Personally, I would like to see a new "even more core than core" crate that has the absolute bare minimum amount of code required to expose all of the compiler's functionality. Then the rest can be moved to crates.io, and can be made optional – useful for cases where, say, you're extremely constrained on binary size and want to be in control of every single line of code that actually makes it into the output binary.

> Then the rest can be moved to crates.io

rand moved from the standard library to crates.io. Because it was not marked as stable in std (or it was before 1.0, I don’t remember), we could simply remove it from there.

But there’s not a lot of that left that we can do. For APIs that are #[stable] in std, we promised to keep having them in std. At most we can deprecate them, which as you noted has downsides. With that in mind, what does “move to crates.io” even mean? Something can not move if it also stays at its current location. Are you saying you want two copies of the same APIs? How is that useful? Can they diverge, in particular if the crates.io one wants to make breaking changes in a 2.0 version?

> can be made optional

This RFC proposes cargo feature flags for standard library crates, which presumably will allow making parts of the standard library optional: you can disable them if you don’t use them. How does “move to crates.io” help with not using something?

> for cases where, say, you're extremely constrained on binary size

The decision to move Unicode tables into libcore was based explicitly on them not affecting the binary size of programs that don’t use them, with confirmation from the Embedded WG. (The linker eliminates unused symbols.) So I feel that binary size is not a valid argument for a smaller standard library.

@Ericson2314 Ericson2314 Jun 9, 2019

@SimonSapin

> Are you imagining a world where there are multiple implementations of Rust where they each have their take on the entire toolchain? Today mrustc exists, but doesn’t have its own Cargo.
>
> My objection was to hard-coding a list of names known to be crates of the standard library in Cargo, because Cargo and the standard library are developed separately.

You may be surprised that I agree with that. When I said implementation-specific I meant rustc, not Cargo. Much unstable code exists because it is more tightly coupled with rustc. A different compiler or interpreter may need to be tightly coupled in a different way.

> You take it as a given here but I still maintain that this idea is fundamentally contradictory, because of the difference in versioning between these two worlds. [...]

I hear what you are saying. But I don't hear any contradictions. To be clear, we agree that std needs to continue to export everything it does today. But nothing says it cannot reexport stuff rather than implement it itself.

Maybe we don't agree that reexporting like this is desirable, but let's first agree that it's possible. OK?
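
As a small illustration of the re-export point above, the following Rust sketch uses two local modules as stand-ins: `external_channel` plays an externally maintained crate and `facade_std` plays a `std` that re-exports it. Both module names and the `Channel` type are hypothetical; nothing here reflects how `std` is actually organized.

```rust
use std::any::TypeId;

/// Stand-in for an externally maintained crate (hypothetical).
mod external_channel {
    #[derive(Debug, Default, PartialEq)]
    pub struct Channel {
        pub capacity: usize,
    }
}

/// Stand-in for `std`: it keeps exporting the name, but only re-exports
/// the type instead of defining it.
mod facade_std {
    pub use super::external_channel::Channel;
}

/// Both paths name one and the same type.
fn paths_name_same_type() -> bool {
    TypeId::of::<facade_std::Channel>() == TypeId::of::<external_channel::Channel>()
}

/// A value built through the facade path is accepted wherever the
/// original path is expected, with no conversion.
fn through_facade(capacity: usize) -> external_channel::Channel {
    facade_std::Channel { capacity }
}
```

The sketch only shows that a re-export is possible at the language level. It says nothing about the versioning concerns raised earlier in the thread.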

@comex I don't think we can have a single compiler-specific crate unless we allow crates to have cyclic dependencies, but yes glad to hear we both want a clean separation between rustc-specific and rustc-agnostic code.

@comex comex Jun 10, 2019

> The decision to move Unicode tables into libcore was based explicitly on them not affecting the binary size of programs that don’t use them, with confirmation from the Embedded WG. (The linker eliminates unused symbols.) So I feel that binary size is not a valid argument for a smaller standard library.

I know. But if you don't want something in the binary, it's cleaner to have it not present at all, rather than needing to manually avoid calling something that's always in scope (e.g. char::is_whitespace). This is especially weird with the format machinery (which is baked into libcore functions like panic_bounds_check); I see a comment of yours from last year saying it is possible to avoid it being linked in, but it's definitely nonintuitive what exactly the constraints are.

But sure, a minimal core could also be achieved by putting everything else behind Cargo feature flags, rather than actually creating a separate crate.

By the way, another use case for a minimal core is if you just want to write your own standard library with a different design, and want to be able to replace things like impls on primitives (impl char, impl<T> [T], impl<T> *const T), or the precise list of methods on Iterator (even if some basic functionality has to be fixed for the compiler to codegen for loops). Those things are lang items, so letting third-party crates implement them on stable would require additional work. However, a minimal core is a prerequisite.

> With that in mind, what does “move to crates.io” even mean? Something can not move if it also stays at its current location. Are you saying you want two copies of the same APIs? How is that useful? Can they diverge, in particular if the crates.io one wants to make breaking changes in a 2.0 version?

As I imagine it, the source of truth would be on crates.io, but the Rust distribution would also include a copy (which would simply be a copy of some version of the crates.io crate), mainly for backwards compatibility purposes. (Edit: Cargo would probably default to pulling the latest version from crates.io when you run cargo update, though we'd want a better concept of 'minimum compiler version' to ensure it only picks versions that are compatible with the current compiler.) A 2.0 could be done in theory but is probably inadvisable. Benefits of having a separate, stable-source crate include:

  • Ability to patch the source on stable (this RFC already proposes patch functionality but only for nightly).
  • Ability for conservatively-minded users to update to a newer compiler while holding back on changes to std, e.g. the recent wholesale replacement of HashMap with hashbrown or the upcoming replacement of synchronization primitives with parking_lot.
  • On the flipside: ability for experimentally-minded users to test such changes in advance of a full release, without having to also change their compiler configuration (which makes comparison more difficult).

```
cargo build --target thumbv7em-freertos-eabihf.json
```

In general, any of the following would prompt Cargo to recompile `core`, rather than use a pre-compiled version:

Rather than trying to make an exhaustive list of cases where a standard library crate is re-compiled, this could instead specify that a crate is compiled locally whenever an identical pre-compiled binary is not available, where “identical” is based on the target, the feature flags, the profile settings, the sources, etc.

To nitpick: does “custom” target mean only JSON files? Not targets known to rustc but where a precompiled libcore is not available? Removing this list of conditions removes the need to make it accurate.
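
The "identical binary" rule suggested above can be sketched as a lookup keyed on everything that affects the output. This is a rough Rust model under stated assumptions: the `BuildKey` fields, the `source_hash` value, and both function names are invented for illustration and are not Cargo's real fingerprinting logic.

```rust
use std::collections::HashSet;

/// Everything that influences the compiled output of a standard library
/// crate in this toy model. Field names are illustrative only.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct BuildKey {
    krate: String,
    target: String,
    features: Vec<String>, // kept sorted and deduplicated so equal sets compare equal
    profile: String,
    source_hash: u64,
}

/// Build a normalized key from loosely typed inputs.
fn build_key(krate: &str, target: &str, features: &[&str], profile: &str, source_hash: u64) -> BuildKey {
    let mut features: Vec<String> = features.iter().map(|f| f.to_string()).collect();
    features.sort();
    features.dedup();
    BuildKey {
        krate: krate.to_string(),
        target: target.to_string(),
        features,
        profile: profile.to_string(),
        source_hash,
    }
}

/// Compile locally exactly when no pre-compiled artifact has an identical key.
fn needs_local_build(wanted: &BuildKey, precompiled: &HashSet<BuildKey>) -> bool {
    !precompiled.contains(wanted)
}
```

With this shape there is no list of invalidation conditions to keep accurate: a different target, feature set, profile, or source simply yields a key that is not in the pre-compiled set.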

Wonderful point! "explicit invalidation" is basically incorrect by construction. Bad in code, and bad in writing.


#### Stabilization of a Target Specification Format

As the custom target specifications (currently JSON) would become part of the stable interface of Cargo, the format used by this file must be stabilized, and further changes must be made in a backwards compatible way to guarantee stability.

Please clarify this “must”. Is this RFC proposing to stabilize the JSON file syntax (and set of keys) as-is? Or is it saying that another RFC will need to propose stabilization, possibly with changes? Based on other comments, I think it should be the latter as the set of keys needs work.

Nobody contradicted me when I said this can be a black box that Cargo compares byte-by-byte as a stop-gap. I think trading some cache hits for future compat is a good move here.
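
A tiny Rust sketch of the black-box comparison described above, to show where the lost cache hits come from. The function name is hypothetical and the JSON contents in the usage note are made up.

```rust
/// Treat a target specification as an opaque blob: two specs are
/// considered equal only if every byte matches. No JSON parsing is done,
/// so the comparison does not depend on the set of keys or their meaning.
fn same_target_spec(a: &[u8], b: &[u8]) -> bool {
    a == b
}
```

For example, `{"llvm-target":"thumbv7em-none-eabihf"}` and the same object with extra whitespace would compare as different under this rule, forcing a rebuild even though the two specs mean the same thing. That is the cost being traded for not committing to a stable format yet.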


## Should Cargo obtain or verify the source code for `libcore` or `libstd`?

Right now we depend on `rustup` to obtain the correct source code for these libraries, and we rely on the user not to tamper with the contents. Are these reasonable decisions?

It is at least consistent with the RUSTC_BOOTSTRAP environment variable being a simple boolean flag rather than something more tricky like it used to be. (I think it was a somewhat-obfuscated release-specific key?) The stability mechanism is a social contract; it does not do much technically to stop someone determined to use unstable features on the stable channel.


## Should the custom built `libcore` and `libstd` reside locally or globally?

e.g., should the build artifacts be placed in `target/`, only usable by this project, or in `.cargo/`, to be possibly reused by multiple projects, if they happen to have the same settings?

The conservative default is target/.

If we are sharing local builds of std between projects in ~/.cargo, why not also share local builds of, say, serde? Or any crate from crates.io, or other source? This is an idea worth exploring, but probably better left to another RFC.


## How do we handle `libcore` and `libstd`'s `Cargo.lock` file?

Right now these are built using the global lock file in `rust-lang/rust`. Should this always be true? How should Cargo handle this gracefully?

I think this is important and needs to be resolved before this RFC can be stabilized.

Related: currently when std depends on a crate from crates.io like libc, that crate is compiled on CI together with std and shipped with it in the sysroot. Users end up with a sysroot libc which is a separate crate from a libc that they would depend on from crates.io themselves. I believe std does not reexport any struct or enum type from libc but if it did, in a user crate it would be a different incompatible type than the one with the same name from crates.io libc.

Based on the principle that locally-compiled sysroot should behave the same as pre-compiled, I think we want to preserve this separation. Even if Cargo will download them from the same place (and share the download cache), sysroot+crates.io packages should be separate from crates.io packages as far as dependency graph resolution is concerned. (Similar to how a git dependency is separate from a crates.io dependency of the same name.)
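
The type incompatibility described above can be reproduced in miniature with two modules standing in for the two copies of the crate. This is only an analogy in Rust: `sysroot_libc`, `cratesio_libc`, and the `Timeval` struct are hypothetical stand-ins, not the real `libc` definitions.

```rust
use std::any::TypeId;

/// Stand-in for the copy of a crate shipped inside the sysroot.
mod sysroot_libc {
    pub struct Timeval {
        pub tv_sec: i64,
        pub tv_usec: i64,
    }
}

/// Stand-in for the copy of the same crate pulled from crates.io.
mod cratesio_libc {
    pub struct Timeval {
        pub tv_sec: i64,
        pub tv_usec: i64,
    }
}

/// Same name and same fields, but still two distinct types.
fn same_type() -> bool {
    TypeId::of::<sysroot_libc::Timeval>() == TypeId::of::<cratesio_libc::Timeval>()
}

/// Crossing the boundary needs an explicit field-by-field conversion.
fn convert(t: sysroot_libc::Timeval) -> cratesio_libc::Timeval {
    cratesio_libc::Timeval { tv_sec: t.tv_sec, tv_usec: t.tv_usec }
}
```

Passing a `sysroot_libc::Timeval` where a `cratesio_libc::Timeval` is expected is a compile error, which is the behaviour a user would see if std leaked such a type.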

Haven't private dependencies landed? I think that is the best way to express this. Unlike a non-crates.io-source trick, it also catches std leaking those types in the first place.

I hope that someday:

  1. Cargo gets a global cache

  2. Sysroot goes away

  3. stdlib crates can safely have public dependencies

But that is a problem for another day.


Another option in this area is to force the use of profile overrides, as specified by [RFC 2282](https://github.com/rust-lang/rfcs/blob/master/text/2282-profile-dependencies.md).

## Should providing a custom `core` or `std` require a nightly compiler?

> But isn't this already possible on stable?

@phil-opp Unlike [patch.sysroot], having a crates.io dependency that happens to be named core or std only affects your own crate. And when doing so on stable, you’re very limited in what this dependency can do. In particular it cannot (re)define lang items.

> Forcing unstable compiler just to be able to pick an std that works on embedded

@aep Do you mean that works with async/await without thread-local storage? IMO this is a failure of async/await, not of the core vs. std structure.


With the ability to build these crates on demand, we may want to decide not to ship `target` bundles for any users.

This would come at a cost of increased compile times, at least for the first build, if the artifacts are cached globally. However it would remove a mental snag of having to sometimes run `rustup target add`, and confusion among some users about why parts of `std` and `core` have different optimization settings (particularly for debug builds) when debugging.

Wouldn’t this RFC as-is already make rustup target add unnecessary? (In at least some cases.) When the pre-compiled standard library is not available for a given target, Cargo falls back to compiling it. We could emit a warning for targets that are known to have a pre-compiled copy available through rustup but not installed locally.

@Nemo157
Member

Nemo157 commented Jun 7, 2019

> Good point. One way to address this, though it might generally be considered an anti-pattern, would be to not have any on-by-default features, and instead have features to disable things.

That's not just an anti-pattern, that doesn't work with how Cargo treats features. If crate foo uses an API bar from std, but crate baz enables the no_bar feature which disables that API then you cannot use those crates together in a single build.

> I suppose the other extreme would be to have default-features = false disable the entire standard library, with features to enable different parts. It would then be possible to add finer grained features as needed (and features could be deprecated, and hidden from documentation). This is rather complicated and probably a crazy idea, but in some ways it seems the most robust.

This doesn't actually seem that crazy to me (I have released a crate which is empty without any features active), you could even take it to the extreme and start with default = ["everything"] and slowly split up the everything feature. The worst part is the deprecation, the more coarse features could never be removed, even if they were ones like everything that don't really make sense.

> And then if all the crates (not only the top level one) might be specifying different std dependencies...

That seems like how it must work, if std is using normal Cargo features, so that crates can specify which bits of std they use.


The above is written from the perspective that std/core features are just normal Cargo features (other than having some extra way to specify some features as unstable so they cannot be used by stable Cargo). Reskimming the RFC though, none of the example features would work as normal Cargo features. The examples I can find are force-tiny-fmt and force_alloc_system. Judging just by their names, these do not sound like additive features that a subcrate could reliably enable. If there's something like a corresponding force-huge-fmt, what would be the result of building core = { features = ["force-tiny-fmt", "force-huge-fmt"] } after combining all the features enabled by dependencies?
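
The unification problem raised in this comment can be sketched in a few lines of Rust. This is a simplified model, not Cargo's resolver: it assumes a crate is built once with the union of the features requested by all dependents, and the `exclusive` pairs are a hypothetical list supplied by the caller. The feature names `no_bar` and `force-huge-fmt` are the made-up examples from the comment.

```rust
use std::collections::BTreeSet;

/// Simplified unification: the crate is built once, with the union of the
/// features requested by every dependent.
fn unify(requests: &[&[&str]]) -> BTreeSet<String> {
    requests
        .iter()
        .flat_map(|r| r.iter())
        .map(|f| f.to_string())
        .collect()
}

/// Report every pair of mutually exclusive features that ended up active
/// together. `exclusive` is a hypothetical, caller-supplied list.
fn conflicts(active: &BTreeSet<String>, exclusive: &[(&str, &str)]) -> Vec<(String, String)> {
    exclusive
        .iter()
        .filter(|(a, b)| active.contains(*a) && active.contains(*b))
        .map(|(a, b)| (a.to_string(), b.to_string()))
        .collect()
}
```

If one dependent requests `force-tiny-fmt` and another requests `force-huge-fmt`, the unified set contains both, and nothing in a plain union says which should win. The same model shows why a negative feature such as `no_bar` cannot work: once any crate in the graph enables it, it is active for every other crate as well, including those that need `bar`.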

@Ericson2314
Contributor

Just because line comments can get lost, I wrote what I'm pretty sure is a complete solution for the default-features = false problem in #2663 (comment). This probably deserves its own RFC, as it's not a stdlib-specific problem or solution. I don't have time to write it from scratch, but I would happily offer lots of feedback for anyone who does.

@ehuss
Contributor

ehuss commented Jul 20, 2019

We are closing this PR for the time being to take a step back and break this down into smaller pieces. We have created a new repository at https://github.com/rust-lang/wg-cargo-std-aware/ to continue discussion on individual issues. Everyone is welcome and encouraged to continue discussion on that repo.

We envision separating this into the following tasks, some of which can begin immediately. How these map to future RFCs is yet to be decided.

  • Create a minimal implementation of building a package with unmodified sysroot crates directly from cargo using an unstable nightly-only command-line flag. This involves no additional UI, and the primary goal is to get a basic version working on a small subset of platforms. This likely won't be super useful initially, but it will allow us to work through the implementation details. This work will begin soon.

  • Create a focused proposal on the syntax for how sysroot dependencies are specified in Cargo.toml.

  • Define the specifics of how features will work with sysroot crates.

  • Separately deal with the custom target specification format.

  • Define how custom source sysroot crates can be built.

  • Discuss the long-term goals for how sysroot interaction can work (facades, building from stable, etc.).

We'd like to thank everyone for the discussion so far; it has been very useful! We hope that taking this approach will allow us to better focus on the issues and make incremental progress sooner.

Labels
T-cargo Relevant to the Cargo team, which will review and decide on the RFC. T-compiler Relevant to the compiler team, which will review and decide on the RFC. T-lang Relevant to the language team, which will review and decide on the RFC.