Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support cargo vendor #23

Open
ehuss opened this issue Sep 6, 2019 · 49 comments
Open

Support cargo vendor #23

ehuss opened this issue Sep 6, 2019 · 49 comments
Labels
implementation Implementation exploration and tracking issues stabilization blocker This needs a resolution before stabilization

Comments

@ehuss
Copy link
Contributor

ehuss commented Sep 6, 2019

The cargo vendor command currently does not know about std dependencies. Will need to figure out how to handle the root dependencies, since they are special.

Source replacement may also be broken, I haven't tried it.

@ehuss ehuss added the implementation Implementation exploration and tracking issues label Sep 6, 2019
@Gankra
Copy link

Gankra commented Oct 30, 2020

Firefox is running into this for tsan CI

I've currently hacked up a script that can weave in the std tomls to a project's vendor, but yeah it'd be nice if this was handled automatically: https://gist.github.com/Gankra/6a025b5cce9d9b047e46a5caeded3050

@alexcrichton
Copy link
Member

One thing to consider for this is that vendoring the standard library's dependencies effectively locks you to a compiler version since the dependencies are likely to drift over time. This may or may not be something that Cargo wants to provide a first-class warning/error about rather than a bland "couldn't find crate version in registry" error.

In theory this integration wouldn't be too too hard though, mostly just loading up the Resolve for the standard library, around here-ish

@ehuss
Copy link
Contributor Author

ehuss commented Oct 30, 2020

Copying my comment from Zulip: We have recently discussed adding all the dependencies to the rust-src component for other reasons (matklad was interested in rust-analyzer support). That would be a potential solution of just not changing cargo vendor, and keeping the standard library in sync with the compiler.

I'm uncertain how we can wire that in to the resolver, though. It may be tricky to use source replacement or paths overrides (they cannot currently be customized independently). An option would be to add patches, but it doesn't know what to patch until after it loads the resolver. Another option would be to surgically modify the Dependency information to be path dependencies when the standard library workspace is loaded. None of those sound terribly appealing to me, though.

@alexcrichton
Copy link
Member

Oh actually I think that's a great idea to inclue the dependencies in rust-src! I think that adding patch entries for all members in the vendor directory of rust-src should work perhaps? I don't think there should be much downside to just iterating through the vendor directory and adding a patch-per-crate. We would want to make sure that any "patch not used" warnings were suppressed as well.

@Gankra
Copy link

Gankra commented Oct 30, 2020

One thing to consider for this is that vendoring the standard library's dependencies effectively locks you to a compiler version since the dependencies are likely to drift over time. This may or may not be something that Cargo wants to provide a first-class warning/error about rather than a bland "couldn't find crate version in registry" error.

This isn't an issue for firefox's usecase in particular. We only need these vendors for our tsan CI, and that uses pinned versions of all the toolchains. Although it is a problem if we want these vendoring checked in and we want people to be able to bump non-std dependencies without having the nightly we use for CI. So it's possible we want to vendor live in CI as part of the build. Or we need to carefully identify deps which are only used by std so we can preserve them across non-std-vendors.

Certainly if the dependencies were just part of rust-src and automagically weaved into the build, that would solve the issue. No need to worry about weaving together different "kinds" of vendor or anything else.

@alexcrichton
Copy link
Member

Yeah pinning versions is too wonky and restrictive, so I think including the dependencies in rust-src is a great solution.

@Gankra
Copy link

Gankra commented Oct 30, 2020

Anything I can do to help the plan along?

@ehuss
Copy link
Contributor Author

ehuss commented Nov 1, 2020

I think the first step is getting the dependencies included in the rust-src component (the relevant code is here). Unfortunately, cargo vendor doesn't have a way to only vendor a subset of packages. I'm not sure how to address that. Some ideas:

  • It could call cargo vendor and then just delete everything it doesn't need.
  • cargo vendor could be enhanced to only vendor a subset.
  • The standard library could be split into a separate workspace (matklad is promoting this, but I am concerned about the downsides).
  • Some hacky solution of manually extracting the minimum set of dependencies in rustbuild.

None of these options seem easy.

@Gankra
Copy link

Gankra commented Nov 2, 2020

I believe the approach I'm using to test out tsan builds right now would work for building "subset vendors": https://gist.github.com/Gankra/6a025b5cce9d9b047e46a5caeded3050

Essentially just temporarily copying the root lockfile to each individual library and then cargo vendor -s'ing them in. Vendor doesn't pull things in just because they're in the lockfile, but it will respect the constraints it applies to the dependency graph.

This would be a fairly maintainable hack as you would only need to directly -s in std and test (and maybe panic_abort for good measure?). Everything else is transitively pulled in by those ones. (hmm, maybe only test is actually needed since it explicitly depends on std and both the panic crates?)

What do you think of shipping that?

@alexcrichton
Copy link
Member

I think that's probably bordering on the 4th bullet of @ehuss's comment above, but I think something along those lines may be workable. As we do for build-std in general we're just trying to load one "root" package and vendor that, so the filtering in cargo-vendor probably isn't the hardest thing in the world. Once you're inside cargo-vendor you've got access to lots of fun internals of Cargo as well which the CLI may not expose.

@Gankra
Copy link

Gankra commented Nov 3, 2020

Would -Zbuild-std also need to be reworked, or would the existence of a vendoring .cargo/config at the root of the rust-src directory automagically do the right thing?

@alexcrichton
Copy link
Member

We would need to update Cargo's implementation of -Zbuild-std as well to automatically inject [patch] entries for crates found in rust-src

@Gankra
Copy link

Gankra commented Nov 5, 2020

Ok I'll be working on this for the next week or two, hopefully won't be too messy.

Gankra added a commit to Gankra/rust that referenced this issue Nov 5, 2020
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this issue Nov 8, 2020
…lacrum

Vendor libtest's dependencies in the rust-src component

This is the Rust side of rust-lang/wg-cargo-std-aware#23

Note that this won't produce a useful result for `cargo -Zbuild-std` if there are multiple versions of a crate vendored, but will otherwise produce a valid vendor dir.

See rust-lang/cargo#8834 for the other half of this change.
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this issue Nov 8, 2020
…lacrum

Vendor libtest's dependencies in the rust-src component

This is the Rust side of rust-lang/wg-cargo-std-aware#23

Note that this won't produce a useful result for `cargo -Zbuild-std` if there are multiple versions of a crate vendored, but will otherwise produce a valid vendor dir.

See rust-lang/cargo#8834 for the other half of this change.
Gankra added a commit to Gankra/rust that referenced this issue Nov 9, 2020
@Ericson2314
Copy link

Ericson2314 commented Jan 22, 2021

@ehuss isn't the decision here about when/where to use the vendor, not when to create it? When doing -Z build-std it uses whatever replacement source the crate provides, globally but conditionally. The fact that this vendor isn't always the same I thought was OK, because source replacement is transparent, so the project workspace won't need to care. The std workspace of course will care what is and isn't there since it has no crates.io to fall back on, but it varies with the vendor so I think it would be OK.

@ehuss
Copy link
Contributor Author

ehuss commented Jan 22, 2021

Sorry, I'm having a hard time understanding what you are saying.

isn't the decision here about when/where to use the vendor, not when to create it?

The intent here is that the std dependencies are always vendored (located in $SYROOT/.../lib/rustlib/src/rust/vendor/...). The issue is that it is not easy to have Cargo conditionally use vendoring for one workspace versus not in another (as described above in #23 (comment)), because it is a global setting (global in the sense of within the Cargo process, since Config is essentially a global).

When doing -Z build-std it uses whatever replacement source the crate provides, globally but conditionally.

Crates don't provide source replacement. Source replacement is driven by the configuration files, which can be local or global relative to the workspace. If you mean conditionally within Cargo, that's what I explained above that the global Config is not designed to be conditional or easily cloned.

The fact that this vendor isn't always the same I thought was OK, because source replacement is transparent

I don't know what is meant by "transparent" here. It is transparent in the sense that vendoring shouldn't behave any differently from using the registry sources. It is not transparent in the sense that all dependencies must be vendored, otherwise it is an error (it does not fall back to anything).

@Ericson2314
Copy link

It is not transparent in the sense that all dependencies must be vendored, otherwise it is an error (it does not fall back to anything).

Ah OK so this is what is tripping us up. I saw https://doc.rust-lang.org/cargo/reference/source-replacement.html#local-registry-sources which says it's OK to have this subset mirror.

Now, supposing that either the book is correct, or the book can be made correct:

It is transparent in the sense that vendoring shouldn't behave any differently from using the registry sources.

Whew :)

Crates don't provide source replacement. Source replacement is driven by the configuration files, which can be local or global relative to the workspace.

I think thats fine. By conditionally, I mean -Z build-std and information from the current compiler / rust-src component would be mixed in at configuration loading time, just like information from multiple config files is already. Everything together would create the single global Config, as today.

Independently from that, -Z build-std would also influence the std workspace as today. So the std workspace would go looking some compiler-specific crate source code, and independently by the config-time mechnaism, the otherwise-unonbtainable source containing those crate source code would be replaced in.

@ehuss
Copy link
Contributor Author

ehuss commented Jan 22, 2021

Ah OK so this is what is tripping us up. I saw https://doc.rust-lang.org/cargo/reference/source-replacement.html#local-registry-sources which says it's OK to have this subset mirror.

By "subset" it is saying that you don't have to mirror all of crates.io locally. It doesn't mean that it will allow something to be missing from the subset. Source replacement works as a "full replacement", not as an overlay.

Also, a local-registry is not what we are talking about here. The vendoring mechanism we are talking about is a directory-source (ala cargo vendor).

@Ericson2314
Copy link

@ehuss OK thanks for the clarifications. Two more questions:

  1. @Gankra's PR both changed the rust-src component format along with trying not make cargo vendor, work right?
  2. The regression just when using cargo vendor, or even without using it? I thought it was the latter, and due to the way the new rust-src was patched in.

My previous comments frankly forgot about the original issue --- cargo vendor --- and were all about what went wrong in the regression case, which as mentioned I thought happened regardless of whether cargo vendor was actually attempted.

@ehuss
Copy link
Contributor Author

ehuss commented Feb 1, 2021

The original changes made Cargo treat the standard library dependencies as always vendored (with -Z build-std), whether or not there is a local source replacement. So it is independent if you run cargo vendor locally.

There were multiple regressions as detailed in #23 (comment). These happened with all -Z build-std builds, and also affected distribution builds of rustc itself.

@Gankra
Copy link

Gankra commented Feb 16, 2021

Hmm ok so that was all a bunch of confusion and didn't actually help us figure out the "right" solution, eh?

@ehuss this stuff is really out of my league -- have you had a chance to mull over the possible solutions?

As far as the partial vendoring thing goes, I'm starting to lean towards just giving up on partial vendoring and just grabbing the whole rustc graph. AIUI that should be fairly unproblematic and also not need to pull in the really heavy stuff like llvm?

Making patching in the vendor totally seamless is the nasty bit I really can't wrap my head around. Is it potentially as simple as combining the WIP patches you and Alex posted?

If we don't have any clear solutions, firefox does have the nuclear option of maintaining patches for cargo for our builds, since my solution did work fine for us. But obviously that's a gross mess we'd rather not maintain.

@cuviper
Copy link
Member

cuviper commented Feb 16, 2021

As far as the partial vendoring thing goes, I'm starting to lean towards just giving up on partial vendoring and just grabbing the whole rustc graph. AIUI that should be fairly unproblematic and also not need to pull in the really heavy stuff like llvm?

There's still some heavy stuff in there -- you can see that in the full rustc-1.50.0-src.tar.xz in vendor/:

$ dua | tail
   7.69 MB jemalloc-sys
   7.88 MB winapi
   9.75 MB vte
  11.63 MB libgit2-sys
  12.16 MB libnghttp2-sys
  25.01 MB curl-sys
  27.65 MB openssl-src
  53.83 MB winapi-i686-pc-windows-gnu
  56.12 MB winapi-x86_64-pc-windows-gnu
 382.91 MB total

@raphaelcohn
Copy link

This is clearly hard to fix - and after my error report in #65 I discovered how hard. @ehuss is completely right about the collision of dependencies in a workspace when vendored - my workspace was using hashbrown 0.9.1, yet the std dependencies needed 0.9.0 and bang! conflict. Which surprised me, until I recalled the point of a workspace is to provide consistent versions.

That means in practice, either I have to track the exact same matching package versions as std (not a particularly easy thing to simply know without a bit of digging) or give up building std. Or provide a sensible mechanism for 'overriding' package version choices in std. Also not nice. Or providing the ideal of cascading workspaces, which works, but is complex. Since std does not expose its private dependencies (eg hashbrown), this is viable and defensible - but tough.

Along with several common packages not playing nicely with cross compilation (libc and cc, I'm looking at you) without wrapper scripts setting up env vars et al, this is just a bit too hard. Giving up for now, personally. All I wanted to do was find a better way to build linux x86_64 musl targets with a few more target features (eg avx512 stuff)...

@Ericson2314
Copy link

@Gankra

Hmm ok so that was all a bunch of confusion and didn't actually help us figure out the "right" solution, eh?

Yes, haha.

But, still believe #64 is a solution here

Which surprised me, until I recalled the point of a workspace is to provide consistent versions.

Heh, well !64 would at least make us have to face down this problem regardless of vendoring. Without knowing all the details, this seems wrong. Per https://github.com/rust-lang/rfcs/blob/master/text/1977-public-private-dependencies.md std should have private dependencies and that is the basis by which this is allowed. Any vendoring, what's in/not-in the workspace, etc. should be completely transparent and independent of the "platonic" build plan.

osa1 added a commit to dfinity/motoko that referenced this issue Oct 6, 2021
Vendoring dependencies is causing a lot of problems in #2761 and other
PRs, because `cargo vendor` currently has a bug and cannot vendor
standard dependencies (deps of alloc and std, see
rust-lang/wg-cargo-std-aware#23).

For a long time I thought vendoring is a necessity and tried to work
around this issue by vendoring dependencies manually, or using
https://github.com/nix-community/naersk.

However, it turns out vendoring is not necessary, and we can download
dependencies in build step just fine. I also don't see any advantages of
vendoring the dependencies. So in this PR we remove vendoring.

Unblocks #2761.
@baloo
Copy link

baloo commented Oct 21, 2021

For nix users where cargo-vendor is the only option available, I found a "workaround" in downloading rust's compiler dependencies manually and merging them.
The result can be found here:
baloo/nixos-firmware@e108c50

mergify bot pushed a commit to dfinity/motoko that referenced this issue Oct 31, 2021
This PR bumps rustc version and removes xargo. std dependencies are now built
with cargo's new `-Zbuild-std` parameter.

To work around rust-lang/wg-cargo-std-aware#23, we implement a tool
"vendor-rust-std-deps" that reads `Cargo.lock` of a Rust toolchain std and
manually vendors all the dependencies.

These dependencies are then merged with the Rust RTS dependencies before the
building the RTS in nix.

RTS README updated with instructions to bump rustc.

New rustc will enable more const functions, new API functions, stabilizations,
and features for other RTS PRs.
@kgrech
Copy link

kgrech commented Feb 25, 2023

The easiest workaround is probably something like this:

RUST_DIR="$(rustc --print=sysroot)"

# Copy the Cargo.lock for Rust to place the `cargo vendor` will see
cp "$RUST_DIR"/lib/rustlib/src/rust/Cargo.lock "$RUST_DIR"/lib/rustlib/src/rust/library/test/

cargo vendor \
    --manifest-path Cargo.toml \
    --sync "$RUST_DIR"/lib/rustlib/src/rust/library/test/Cargo.toml \
    cargo_vendor

# Remove copied file
rm "$RUST_DIR"/lib/rustlib/src/rust/library/test/Cargo.lock

@ehuss ehuss added the stabilization blocker This needs a resolution before stabilization label May 3, 2023
@Dev380
Copy link

Dev380 commented Sep 2, 2023

@kgrech sorry if this is a dumb question, but how would I make cargo use the vendored stdlib when building? cargo vendor says

[source.crates-io]
replace-with = "vendored-sources"

[source.vendored-sources]
directory = "vendor"

but obviously it's not from crates.io and I don't know the registry to tell it to replace

@kgrech
Copy link

kgrech commented Sep 2, 2023

@Dev380 as far as I remember, the std library sources need to be installed using rustup. I think they never fetched from crates.io. it is only for dependencies of std lib, but not stdlib itself.

Not sure I fully understand you question, but you just put what cargo vendor prints to your configuration. vendor is just a folder created by cargo vendor

@kleisauke
Copy link

  • The standard library could be split into a separate workspace (matklad is promoting this, but I am concerned about the downsides).

This has been done via PR rust-lang/rust#128534 (nightly-2024-08-05).

@bjorn3
Copy link
Member

bjorn3 commented Aug 11, 2024

I think it would make sense to provide pre-vendored versions of all standard library dependencies in the rust-src component and have -Zbuild-std unconditionally use those instead of downloading them from crates.io. Unlike adding support for vendoring the standard library to cargo vendor, this will ensure that the right versions for the toolchain that will build the project are available even if a different toolchain version vendored all dependencies.

@weihanglo
Copy link
Member

have -Zbuild-std unconditionally use those instead of downloading them from crates.io.

Haven't followed the entire thread, but I think we may want vendoring as an opt-in feature. Tarball size may not matter (285M vs 297M gz tarball size). However, people may want to patch dependencies when building std. We should at this ensure that case to work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
implementation Implementation exploration and tracking issues stabilization blocker This needs a resolution before stabilization
Projects
None yet
Development

No branches or pull requests