Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat/draft v standard #165

Merged
merged 8 commits into from
Apr 22, 2021
Merged

Feat/draft v standard #165

merged 8 commits into from
Apr 22, 2021

Conversation

Stebalien
Copy link
Member

No description provided.

@mikeal mikeal self-requested a review March 16, 2020 22:55
@mikeal mikeal marked this pull request as ready for review March 16, 2020 22:55
mikeal
mikeal previously requested changes Mar 16, 2020
Copy link
Contributor

@mikeal mikeal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need some way of noting that this is a breaking change for any dependent projects.

Especially since this column is not being appended to the end, this is going to change/break the representation that comes back from many CSV parsers. It’s pretty common to throw out the header line and just parse each line into an array rather than parsing each line into a map using the header values as key names.

@Stebalien
Copy link
Member Author

I agree, that's why the tests are failing. But I'm not sure how to actually do that.

@mikeal
Copy link
Contributor

mikeal commented Mar 17, 2020

Is anyone pulling the latest master automatically or are all the known libraries keeping their own cached version?

@Stebalien
Copy link
Member Author

go-multihash is pulling into a submodule but (a) that's locked against a specific commit and (b) we're only using it for tests.

@vmx?

README.md Outdated
Comment on lines 42 to 43
* draft - this codec has been reserved but may be reassigned if it doesn't gain wide adoption.
* standard - this codec has been widely adopted and may not reassigned.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about renaming "standard" to "permanent" like the URI Schemes Registry is doing it. That would make clear that having a standard, might still not be enough to get the permanent assignment here.

I think "draft" is a good name.

@vmx
Copy link
Member

vmx commented Mar 17, 2020

In the JS world there is a script that takes the file that generates JS code. It's a manual process.
In Rust, there is not even a script, there it's totally manually.

@rvagg
Copy link
Member

rvagg commented Jul 1, 2020

So, merging this and breaking people's stuff will tell them that they should probably be relying on the header to know what the columns are rather than hard-wiring it in. So I'm +1 on merging and taking care of the consumers that we control and adding in documentation that if you're consuming the table you must use the header to determine column contents.

@ribasushi
Copy link

ribasushi commented Jul 1, 2020

You know - we could just switch to a slightly more extensible format: JSON, YAML, whatever: something that allows us to grow the metadata without rebreaking the world every time.

We can also keep the .csv around by simply having a step that regenerates it from the now-canonical master-spec.

I would volunteer to implement this, together with the necessary CI hooks, provided there is buy in to go that route.

@rvagg
Copy link
Member

rvagg commented Jul 1, 2020

XML please, lashings of it, with namespaces too, and a schema while you're at it!

+1 to a JSON version as the primary. We could even just publish it to npm as is for consumption by js-multicodec and js-multihash (maybe). Let's see what @vmx says.

@vmx
Copy link
Member

vmx commented Jul 10, 2020

For this case I don't like JSON for the lack of being able to add comments, but so is CSV => I think we should go with JSON as canonical representation, and generate CSV out of it (as CSV is really so much easier to read/look things up).

@mikeal
Copy link
Contributor

mikeal commented Jul 10, 2020

Why don’t we just use a markdown table and write a simple script that bulids a .csv and .json file on every checkin using GitHub Actions?

@rvagg
Copy link
Member

rvagg commented Jul 11, 2020

just use a markdown table

Because it's not that much different from a CSV then, we haven't solved very much. We may as well have an "input CSV" an a "consume-me CSV" if we head down that route. The only thing a Markdown table really buys us is the ability to embed it in a large Markdown doc (the table is getting large enough that this probably isn't something we need to be doing) or to discourage it from being consumed as the primary source (documentation can do that job, as can naming).

@vmx is there much more than we'd need in terms of comments than we could fit into a "comments" field for each entry? That can be arbitrary text with comments about anything relating to an entry.

@vmx
Copy link
Member

vmx commented Jul 13, 2020

@vmx is there much more than we'd need in terms of comments than we could fit into a "comments" field for each entry?

Good point. I think that's enough.

Stebalien and others added 3 commits April 13, 2021 10:21
TODO:
* update CI
* Mark stable codecs.
This makes it clearer that there doesn't necessarily be an official standard,
but having wide adoption and being a de-facto standard could still be enough.
@Stebalien
Copy link
Member Author

I don't have the time to refactor how we store multicodecs but I really want this merged so we can more easily merge new multicodec registrations.

table.csv Outdated Show resolved Hide resolved
table.csv Outdated Show resolved Hide resolved
blake2b-496, multihash, 0xb23e, permanent,
blake2b-504, multihash, 0xb23f, permanent,
blake2b-512, multihash, 0xb240, permanent,
blake2s-8, multihash, 0xb241, draft, Blake2s consists of 32 output lengths that give different hashes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is intentionally Blake2b permanent and Blake2s draft?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not aware of anyone using the blake2s codec in practice.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could also make this argument for most of the blake2b hashes, blake2b-256 is the only one in wide use (because of Filecoin)

@vmx
Copy link
Member

vmx commented Apr 14, 2021

@Stebalien you mention that having "draft" and "permanent" would make merging new multicodecs easier. My impression is that this wasn't really the problem so far when we didn't merge things. I think the problem is that we miss a clear definition what multicodec codes (and their categories) are about. It boils down to: Is it just a random table with numbers in it, or do they have a certain meaning.

@Stebalien
Copy link
Member Author

It boils down to: Is it just a random table with numbers in it, or do they have a certain meaning.

Well, that part usually boils down to "are they using IPLD right". But draft v permanent would make me feel better about adding new codes.

Copy link
Member

@vmx vmx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before merging I'd like to also have the approval from @rvagg, as he also spent quite a lot of time on all this.

@Stebalien Stebalien dismissed mikeal’s stale review April 16, 2021 18:25

So we can actually merge.

Copy link
Member

@rvagg rvagg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with the column, its the first-pass of filling it in that I think has issues. We can resolve them later but these are the most important discrepancies from my perspective:

  • only one of the blake2b hashes has widespread usage yet they are all permanent here
  • sha2-256-trunc254-padded and poseidon-bls12_381-a2-fc1 are the CID pairs for fil-commitment-unsealed and fil-commitment-sealed respectively. The former are draft but the latter are permanent.

Second-level concerns I have are:

  • I think the bitcoin* and eth* codecs should be permanent, they are being used in the wild, outside of PL, and as more than just examples (dbl-sha2-256 would come along with bitcoin*)
  • cbor and json would be nice to have as permanent, although I can't point to active non-example uses of them
  • dag-cose is getting more use in the wild I think, but we could wait for input from those users

blake2b-496, multihash, 0xb23e, permanent,
blake2b-504, multihash, 0xb23f, permanent,
blake2b-512, multihash, 0xb240, permanent,
blake2s-8, multihash, 0xb241, draft, Blake2s consists of 32 output lengths that give different hashes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could also make this argument for most of the blake2b hashes, blake2b-256 is the only one in wide use (because of Filecoin)

@Stebalien
Copy link
Member Author

only one of the blake2b hashes has widespread usage yet they are all permanent here

My concern is that poking holes in this range could be troublesome if someone actually needs one of the others. But then again, we might as well stick to our policy.

sha2-256-trunc254-padded and poseidon-bls12_381-a2-fc1 are the CID pairs for fil-commitment-unsealed and fil-commitment-sealed respectively. The former are draft but the latter are permanent.

Will fix.

I think the bitcoin* and eth* codecs should be permanent, they are being used in the wild, outside of PL, and as more than just examples (dbl-sha2-256 would come along with bitcoin*)

Fair enough.

cbor and json would be nice to have as permanent, although I can't point to active non-example uses of them
dag-cose is getting more use in the wild I think, but we could wait for input from those users

SGTM.

@Stebalien
Copy link
Member Author

dag-cose is getting more use in the wild I think, but we could wait for input from those users

Jose or Cose or both?

@Stebalien
Copy link
Member Author

@rvagg I've fixed everything but jose/cose.

@Stebalien Stebalien requested a review from rvagg April 21, 2021 19:33
Copy link
Member

@rvagg rvagg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leave cose/jose as draft for now

👍

@Stebalien Stebalien merged commit 14a3825 into master Apr 22, 2021
@Stebalien Stebalien deleted the feat/draft-v-standard branch April 22, 2021 05:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants