Document why Package manifests won't have a uniform schema in the Registry #18

Closed
f-f opened this issue Apr 26, 2020 · 5 comments
Labels
discussion document me Improvements or additions to documentation
Comments

@f-f
Member

f-f commented Apr 26, 2020

In #4 I expressed this concern:

I was thinking about moving the packages folder under v1 too, but decided otherwise.
The reason is that when we change the Package type (or, more generally, when the hash of Registry.dhall changes) we'll make a v2 folder. We wouldn't have to migrate all the files right away, so I think keeping some of them on old versions of the schema is probably fine.
I don't have a strong opinion on this and I'm fine with either - there's value in having a packages folder for every version: it's more files, but consumers then have the assurance that all packages down that folder match the corresponding type.

As a note, migrating between versions of different schemas is usually possible in pure Dhall.
Fictional example: let's say in v2 we want to go from a name : Text to name : List Text.

Then we could write a migration function in Dhall:

let v1tov2 = \(pkg : ./v1/Package.dhall) ->  pkg // { name = [ pkg.name ] }

in v1tov2

..and if you'd like to migrate some old definition of a package, then it would just be a matter of applying the function to it:

./v1tov2.dhall ./some_v1_package_definition.dhall

This is nice because consumers can choose which version of the schema they want to work with, and migrate the data according to their needs, while at the same time we don't need to duplicate data here at all.

A recap of the problem first: we'll want to change the Package schema over time. How do we handle migrations, old versions, clients coding against one interface, etc?

In the quote above you can find a solution for how to handle data migration, but what's still not clear is how to ensure that clients can pull the manifest files in the schema they expect.

At first I thought we had a choice between these two options:

  1. we keep manifests in the version they've been originally published in
  2. we keep multiple copies of the same manifest, one for each version of the Package schema

Option 2 would be really nice, because e.g. a client that needs to query the manifest for prelude/v5.1.2 could choose to do it for v1 or v2.

..however, we cannot do this, because we are constrained to keep in the registry repo the exact manifest that was contained in the published tarball.
If we had multiple instances of the same manifest (in different schemas), then we wouldn't know which one went into the tarball.

This means that clients will have to negotiate the version of the Package schema they're trying to use.
In practice this means:

  • downloading the manifest
  • trying to typecheck it with v1
  • trying to typecheck it with v2
  • ..etc.

This is of course totally fine, as long as we are aware of it.
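
To make the negotiation steps above concrete, here is a minimal client-side sketch in Python. It is only an illustration under stated assumptions: the two schemas are the fictional ones from the quoted example (v1 has name : Text, v2 has name : List Text), manifests are modelled as plain dictionaries, and every function name is hypothetical. A real client would typecheck the whole manifest against the Dhall type rather than inspect a single field.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Manifest = Dict[str, Any]


def is_v1(manifest: Manifest) -> bool:
    """Hypothetical v1 check: `name` is a single text value."""
    return isinstance(manifest.get("name"), str)


def is_v2(manifest: Manifest) -> bool:
    """Hypothetical v2 check: `name` is a list of text values."""
    name = manifest.get("name")
    return isinstance(name, list) and all(isinstance(n, str) for n in name)


# Known schema versions, tried in order from oldest to newest.
SCHEMAS: List[Tuple[str, Callable[[Manifest], bool]]] = [
    ("v1", is_v1),
    ("v2", is_v2),
]


def negotiate(manifest: Manifest) -> Optional[str]:
    """Return the first schema version the manifest conforms to, or None."""
    for version, conforms in SCHEMAS:
        if conforms(manifest):
            return version
    return None


def v1_to_v2(manifest: Manifest) -> Manifest:
    """Python counterpart of the fictional v1tov2 migration: wrap `name` in a list."""
    return {**manifest, "name": [manifest["name"]]}


def read_as_v2(manifest: Manifest) -> Manifest:
    """Detect the manifest's schema version, then migrate it to v2 if needed."""
    version = negotiate(manifest)
    if version == "v1":
        return v1_to_v2(manifest)
    if version == "v2":
        return manifest
    raise ValueError("manifest matches no known Package schema")
```

The stored manifest is never rewritten: the client detects the version it was published in and migrates its own in-memory copy, so the registry can keep exactly the file that went into the tarball.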

So all of this needs to be documented.

@f-f f-f added discussion document me Improvements or additions to documentation labels Apr 26, 2020
@hdgarrood
Contributor

Hm, to me this sounds a bit awkward for clients. If clients are going to have to handle all versions of the registry, does that mean we are committing to only making changes to the registry type definitions which allow automatic migrations?

@f-f
Member Author

f-f commented Apr 26, 2020

@hdgarrood can you detail an example of a schema change that cannot be performed automatically?

And to answer your question: yes, but I'll clarify that we only have to guarantee this for forward-compatibility: since published packages are immutable, we only have to care about clients being able to read all the schemas; we don't have to support "downgrading" the schema for clients that cannot support the latest one.

@hdgarrood
Contributor

For schema changes we might not be able to handle automatically, how about:

  • Strengthening package name constraints
  • Making a key which was previously optional (say, the repository) required
  • Rejecting a particular license once it is considered not to be open source and is dropped from the SPDX spec

@f-f f-f added this to the v1 milestone May 20, 2020
@f-f
Member Author

f-f commented Sep 28, 2020

In #76 we introduce the notion that package Manifests will only guarantee forwards-compatibility, so the only changes allowed will be (quoting from the new draft):

  • adding new fields
  • removing optional fields
  • relaxing constraints not covered by the type system
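
As a rough illustration of the rules just listed, the sketch below checks whether going from one schema version to the next stays within the allowed changes. It rests on several assumptions: a schema is modelled as a mapping from field name to a hypothetical Field(type, optional) pair, any added field is accepted, and any change to an existing field's type or optionality is conservatively rejected. Constraints not covered by the type system are not modelled at all.

```python
from typing import Dict, List, NamedTuple


class Field(NamedTuple):
    """Hypothetical description of one manifest field."""
    type: str
    optional: bool


Schema = Dict[str, Field]


def disallowed_changes(old: Schema, new: Schema) -> List[str]:
    """List the changes from `old` to `new` that fall outside the allowed set.

    Allowed here: adding fields, removing optional fields.
    Everything else that touches an existing field is reported.
    """
    problems: List[str] = []
    for name, field in old.items():
        if name not in new:
            # Only optional fields may be removed.
            if not field.optional:
                problems.append(f"removed required field '{name}'")
        elif new[name] != field:
            # Type or optionality of an existing field changed.
            problems.append(f"changed existing field '{name}'")
    # Fields present only in `new` are additions, which are always accepted.
    return problems
```

For example, with old = {"name": Field("Text", False), "repository": Field("Text", True)}, a new schema that makes repository required is reported as "changed existing field 'repository'", while one that drops repository and adds a license field yields an empty list.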

@hdgarrood: by this definition all the changes you listed in the last comment will not be allowed. I think this is fine though, as long as we are careful to be strict in constraints and minimal in defining fields right now, because we can always relax constraints and add fields in later versions.

This said, I think the new draft clarifies the original purpose of this issue, so I think we could close this?

@hdgarrood
Contributor

Sounds good 👍

@f-f f-f closed this as completed Sep 28, 2020