Allow to share versions of dependencies between unrelated Cargo packages #5332

Open
matklad opened this issue Apr 9, 2018 · 13 comments
Labels
  • C-feature-request (Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted`)
  • Command-vendor
  • S-needs-design (Status: Needs someone to work further on the design for the feature or fix. NOT YET accepted.)

Comments

@matklad
Member

matklad commented Apr 9, 2018

Several users of Cargo want the ability to build a set of unrelated Cargo packages (i.e., not a workspace) with tight control over crates.io dependencies. Tight control, in particular, means:

  • the desire to whitelist and audit crates.io dependencies.
  • sharing as many dependency versions as possible across all packages.

Currently, it is possible to achieve a similar effect by abusing workspaces, but there's a feeling that a more first-class solution is required.

Case studies
  1. Google's Fuchsia. They use a single enormous workspace for all their Rust crates, and invoke cargo-vendor on that: https://fuchsia.googlesource.com/third_party/rust-crates/
  2. Facebook has a single pseudo Cargo.toml with all their 3rd-party dependencies. They invoke Cargo on it to download all the crates, and then use Buck to actually build them.
  3. Debian is in a similar situation (they have a whitelist of packaged crates, and there's a "single version" restriction), but they can't use workspaces, because they must work with existing crates. I'm not sure exactly how this all works out for them; the relevant docs are here: https://wiki.debian.org/Teams/RustPackaging/Policy
  4. Finally, Rust itself uses a workspace hack in rust-lang/rust, which unites rustc, cargo, and rls, although each of them lives in a separate, non-workspaced repository.
Strawman proposal

We can imagine a tool to enable this use case, which works similarly to cargo-vendor, but with a slightly different flavor. Today, cargo-vendor looks at an existing Cargo project, collects all of its dependencies from crates.io, and packages them as a directory source.

The proposed tool would start with an explicit "dependencies to vendor" specification file (vendor.toml), which lists a set of packages. This file is the point of audit, and the place to make sure that no duplicate packages are allowed. Then, all other packages would be resolved against this selection of dependencies. Corollary: in this system, there's no need for Cargo.lock files, because the set of packages is fixed and the resolution is deterministic.

Bonus points:

  • determine that the set of vendored packages is consistent, that is, that each vendored package has its dependencies vendored as well.
  • nice resolution error messages: "you depend on foo 1.0.0, it is not vendored, but exists on crates.io"
  • add ability to generate vendor.toml from a set of leaf Cargo packages.
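
The first bonus point (checking that the vendored set is consistent) can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Vendored` struct, `check_consistency`, and the simplified caret matching are hypothetical names invented here, not part of Cargo or cargo-vendor, and the in-memory map stands in for whatever a parsed vendor.toml would contain.

```rust
use std::collections::BTreeMap;

/// An exact version, as it would be pinned in the hypothetical vendor.toml.
type Version = (u64, u64, u64);

/// A vendored package: its pinned version plus the dependencies it declares.
/// Each dependency is (name, minimum version), read as a caret requirement.
struct Vendored {
    version: Version,
    deps: Vec<(&'static str, Version)>,
}

/// Simplified caret matching: same major version and at least the minimum.
/// Assumption: real Cargo treats 0.x versions specially; that is ignored here.
fn caret_matches(min: Version, candidate: Version) -> bool {
    candidate.0 == min.0 && candidate >= min
}

/// Returns one message per dependency that the vendored set fails to satisfy.
fn check_consistency(set: &BTreeMap<&'static str, Vendored>) -> Vec<String> {
    let mut problems = Vec::new();
    for (name, pkg) in set {
        for (dep, min) in &pkg.deps {
            match set.get(dep) {
                // The dependency is vendored at a compatible version.
                Some(found) if caret_matches(*min, found.version) => {}
                // The dependency is vendored, but at an incompatible version.
                Some(found) => {
                    let (a, b, c) = *min;
                    let (x, y, z) = found.version;
                    problems.push(format!(
                        "{name} needs {dep} ^{a}.{b}.{c}, but {x}.{y}.{z} is vendored"
                    ));
                }
                // The dependency is missing from the vendored set entirely.
                None => problems.push(format!("{name} needs {dep}, which is not vendored")),
            }
        }
    }
    problems
}
```

An empty result means every vendored package has its dependencies vendored as well. Because each package name appears at most once in the map, the "no duplicate versions" rule holds by construction in this sketch.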
@sunshowers
Contributor

Thanks.

Facebook has a single pseudo Cargo.toml with all their 3rd-party dependencies. They invoke Cargo on it to download all the crates, and then use Buck to actually build them

To be clear, we at Facebook use Cargo to build our third-party dependencies and Buck to build Facebook-internal code.

@Ericson2314
Contributor

Ericson2314 commented Apr 9, 2018

Case study 4 will be solved by Xargo integration / a successor to rust-lang/rfcs#1133.

The other three are solved in other build systems by "constraint" files which can be shared by multiple solving units. Sort of like an imported lockfile. So I'd say do that, and don't make vendor.toml another concept but just allow importing read-only lockfiles.

@luser
Contributor

luser commented Apr 10, 2018

FWIW, in Firefox we only just finally switched to using a workspace last month after getting a number of issues sorted out. Prior to that we simply called cargo vendor with multiple --sync options to give it the full set of Cargo.lock files in our repository.

@jsgf
Contributor

jsgf commented Apr 10, 2018

Comments on strawman:

  1. We (Facebook) are vendoring both things from crates.io and git repo snapshots. We're currently using [patch.crates-io] to override this consistently.
  2. We want to set features on them
  3. We want to be able to pin them to specific version constraints, but generally use "*" to get newest possible (where the updates to internal code are straightforward).

Corollary: in this system, there's no need for Cargo.locks, because the set of packages is fixed and the resolution is deterministic.

I don't see how this follows? Wouldn't the resolution change as crates.io changes?

@matklad matklad added the C-feature-request Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted` label Apr 11, 2018
@matklad
Member Author

matklad commented Apr 11, 2018

I don't see how this follows? Wouldn't the resolution change as crates.io changes? ... We want to be able to pin them to specific version constraints.

My idea was that you explicitly define the set of vendored packages, which is fixed by some configuration file. The config file is basically a global lockfile, so it lists exact versions, and not version requirements. Then, the resolution of dependencies runs against this set of packages, and not against crates.io. In a sense, this creates a local subset of crates.io ecosystem, restricting the set of packages/versions in the registry.
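
To make the "local subset of crates.io" idea concrete, here is a minimal sketch of resolving a requirement against a restricted registry. Everything in it is an assumption for illustration: the two maps stand in for the vendored set and the upstream index, `resolve` and `ResolveError` are made-up names, and the caret matching ignores Cargo's special handling of 0.x versions.

```rust
use std::collections::BTreeMap;

type Version = (u64, u64, u64);

#[derive(Debug, PartialEq)]
enum ResolveError {
    /// Nothing vendored matches, but the upstream index has a candidate.
    NotVendored { name: String, required: Version },
    /// Neither the vendored set nor the upstream index has a candidate.
    Unknown { name: String },
}

impl ResolveError {
    /// Renders an error in the spirit of the message suggested in the proposal.
    fn message(&self) -> String {
        match self {
            ResolveError::NotVendored { name, required: (a, b, c) } => format!(
                "you depend on {name} {a}.{b}.{c}, it is not vendored, but exists on crates.io"
            ),
            ResolveError::Unknown { name } => format!("no package named {name} matches"),
        }
    }
}

/// Simplified caret matching: same major version and at least the minimum.
fn caret_matches(min: Version, candidate: Version) -> bool {
    candidate.0 == min.0 && candidate >= min
}

/// Picks the highest version of `name` in `index` that is compatible with `min`.
fn best_match(index: &BTreeMap<&str, Vec<Version>>, name: &str, min: Version) -> Option<Version> {
    index
        .get(name)?
        .iter()
        .copied()
        .filter(|v| caret_matches(min, *v))
        .max()
}

/// Resolves only against the vendored set; the upstream index is consulted
/// solely to produce a more helpful error.
fn resolve(
    vendored: &BTreeMap<&str, Vec<Version>>,
    upstream: &BTreeMap<&str, Vec<Version>>,
    name: &str,
    min: Version,
) -> Result<Version, ResolveError> {
    if let Some(version) = best_match(vendored, name, min) {
        return Ok(version);
    }
    match best_match(upstream, name, min) {
        Some(_) => Err(ResolveError::NotVendored { name: name.to_string(), required: min }),
        None => Err(ResolveError::Unknown { name: name.to_string() }),
    }
}
```

Since the result depends only on the vendored map, new releases upstream cannot change what gets selected; they can only change which error is reported.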

We want to set features on them

Interesting! To unpack this: another source of duplication, besides duplicated versions, is packages built with different sets of features enabled. So we may want to forcibly enable features at the global level. Interestingly, rust-lang/rust has this problem and, to hack around it, we had to add a synthetic dependency to Cargo just to enable some features: 841f20a#diff-80398c5faae3c069e4e6aa2ed11b28c0

@jsgf
Contributor

jsgf commented Apr 11, 2018

The config file is basically a global lockfile, so it lists exact versions, and not version requirements.

How would one determine what exact version to put in there, and make sure it's consistent with everything else? I think we'd want cargo to do that resolution via the existing requirements/lock model.

To unpack this: another source of duplication, besides duplicated versions, is packages with different sets of features available.

In principle, features are always additive so simply unioning all features should work. But in practice this isn't enforced, and I know some crates use features for exclusive A vs B choices. (I think this is a whole other discussion which needs resolution.)

We're mostly using them to enable optional features and dependencies (like "bytes" dependency on "serde").
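
As a rough illustration of "simply unioning all features", the sketch below merges the feature lists requested for each package by different dependents. The function name and input shape are hypothetical, and it assumes features really are additive, which, as noted above, is not enforced in practice.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Unions the features requested for each package across all requesters.
/// Each request is (package name, features that one dependent asks for).
fn unify_features<'a>(
    requests: &[(&'a str, &[&'a str])],
) -> BTreeMap<&'a str, BTreeSet<&'a str>> {
    let mut unified: BTreeMap<&str, BTreeSet<&str>> = BTreeMap::new();
    for (package, features) in requests {
        // A package requested with no features still gets an (empty) entry.
        unified
            .entry(*package)
            .or_default()
            .extend(features.iter().copied());
    }
    unified
}
```

With one dependent asking for "serde" on bytes and another asking for "std", the unified entry for bytes contains both, so a single build of the package can serve every dependent. A crate that uses features for an exclusive A-vs-B choice would break under this scheme.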

@luser
Contributor

luser commented Apr 11, 2018

Do you think designing a new format for this would have significant advantages compared to making cargo able to generate a single lockfile for multiple crates as if they were in a single workspace (likely overriding any existing workspace definitions)? If you could do something like cargo generate-lockfile --manifest-path a/Cargo.toml --manifest-path b/Cargo.toml --override-lockfile Cargo.lock to generate a single Cargo.lock, and then somehow instruct cargo to use that when building crates a and b (cargo build --override-lockfile ../Cargo.lock ?) it seems like you could use the existing cargo vendor tool.

Hand-authoring this file doesn't sound great since you'd wind up having to iterate over transitive dependencies. It would force developers to do a bunch of work that cargo already knows how to do.

If your concern is auditing the list of packages then presumably looking at the diff of Cargo.lock after making changes to the set of crates involved would be a suitable way to do that. I've reviewed a series of Firefox patches that changed versions of vendored crates (here's a recent example) and they're generally pretty easy to reason about.

@matklad
Member Author

matklad commented Apr 11, 2018

Do you think designing a new format for this would have significant advantages compared to making cargo able to generate a single lockfile for multiple crates?

If your concern is auditing the list of packages

I personally don't really have value judgements here, because I am not too familiar with all use-cases here :)

So it looks like, if we forget about features, we really just need a way to generate a lockfile for several workspaces simultaneously. Basically, this loop in cargo-vendor should somehow be inside the version-resolution process.
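
A toy version of "one lockfile for several workspaces" might look like the sketch below. It is deliberately simplified and rests on assumptions: it only considers direct dependencies (no transitive resolution or backtracking), `shared_lock` is a hypothetical name, and the `available` map stands in for the registry index. It is not how Cargo's resolver works.

```rust
use std::collections::BTreeMap;

type Version = (u64, u64, u64);

/// Simplified caret matching: same major version and at least the minimum.
/// Assumption: Cargo's special rules for 0.x versions are ignored here.
fn caret_matches(min: Version, candidate: Version) -> bool {
    candidate.0 == min.0 && candidate >= min
}

/// Gathers the direct requirements of every root package and picks, for each
/// dependency, the highest available version that satisfies all of them.
fn shared_lock(
    roots: &[Vec<(&'static str, Version)>],
    available: &BTreeMap<&'static str, Vec<Version>>,
) -> Result<BTreeMap<&'static str, Version>, String> {
    // Step 1: collect every minimum version requested for each dependency.
    let mut mins: BTreeMap<&'static str, Vec<Version>> = BTreeMap::new();
    for root in roots {
        for (name, min) in root {
            mins.entry(*name).or_default().push(*min);
        }
    }

    // Step 2: choose one version per dependency that all roots accept.
    let mut lock = BTreeMap::new();
    for (name, reqs) in mins {
        let chosen = available
            .get(name)
            .into_iter()
            .flatten()
            .copied()
            .filter(|v| reqs.iter().all(|min| caret_matches(*min, *v)))
            .max();
        match chosen {
            Some(version) => {
                lock.insert(name, version);
            }
            None => return Err(format!("no single version of {name} satisfies every package")),
        }
    }
    Ok(lock)
}
```

The point of the sketch is that the requirements from all roots are merged before any version is chosen, which is what distinguishes it from resolving each package separately and then syncing the results.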

@jsgf
Contributor

jsgf commented Apr 11, 2018

Is this equivalent to nested workspaces?

@matklad
Member Author

matklad commented Apr 11, 2018

@jsgf it depends on the definition of a nested workspace :-)

However, I would say that at least for Debian's case of packaging existing crates it does not seem to be the case. One of the fundamental properties of workspaces is that children are aware of the parent workspace (there's workspace = "path" in the manifest; sometimes it is inferred, but conceptually it is there). Debian would like to take stock crates.io crates and compile them using only the versions of dependencies packaged with Debian.

@migueloller

Another good use case for this would be to ease development in a monorepo. If there are two unrelated Cargo packages that are maintained separately, it is inconvenient to have to keep their shared dependencies in sync even though the packages aren't related.

@sanmai-NL

sanmai-NL commented Mar 30, 2019

Workspaces as a concept may be an abuse in itself. If you use a monorepo or metarepo, you may alternatively leverage Cargo.lock sharing or vendored dependencies (cargo-vendor), together with cross-repo references such as Git submodules or Plastic SCM's Xlinks. This works well. We should be cautious about featuritis: adding complicated concepts such as workspaces and then still not being satisfied for non-pragmatic reasons.

@epage
Contributor

epage commented May 24, 2023

See also #4353

@epage epage added the S-needs-design Status: Needs someone to work further on the design for the feature or fix. NOT YET accepted. label Oct 19, 2023