Is this codec in unsigned-varint format? #224

Closed
keepsimple1 opened this issue Aug 5, 2021 · 5 comments

Comments

@keepsimple1

keepsimple1 commented Aug 5, 2021

I must have missed something obvious. For example, this item in the current table:

leofcoin-block,                 ipld,           0x81,           draft,     Leofcoin Block

The code 0x81 is binary 10000001, which in my understanding of unsigned varint should indicate that there is a 2nd byte. Why isn't there a 2nd byte in the table for this codec?

@vmx
Member

vmx commented Aug 6, 2021

The table lists the un-encoded numbers. So if 0x81 is used in, e.g., a CID, its unsigned varint encoding would be the two bytes 0x8101.
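To make the step from the table value to the encoded bytes concrete, here is a minimal Python sketch of the usual 7-bits-per-byte unsigned varint scheme (low-order group first, high bit set on every byte except the last). The function name `encode_uvarint` is made up for illustration and is not taken from any multiformats library.

```python
def encode_uvarint(value: int) -> bytes:
    """Encode a non-negative integer as an unsigned varint (illustrative sketch)."""
    if value < 0:
        raise ValueError("unsigned varints cannot represent negative numbers")
    out = bytearray()
    while True:
        low7 = value & 0x7F      # take the lowest 7 bits
        value >>= 7              # drop them from the remaining value
        if value:
            out.append(low7 | 0x80)  # more bytes follow: set the continuation bit
        else:
            out.append(low7)         # last byte: continuation bit stays clear
            return bytes(out)


# 0x81 is the un-encoded table value for leofcoin-block.
print(encode_uvarint(0x81).hex())  # 8101
```

The first output byte happens to equal the table value here only because the low 7 bits of 129 are `0000001` and the continuation bit is set on top of them.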

@keepsimple1
Author

keepsimple1 commented Aug 6, 2021

May I suggest adding (or just replacing with) the actual unsigned varint encoded code? In real-life use, we work with and see the encoded numbers.

Just like with UTF-8, which uses a similar technique, we only care about the actual encoded bytes.

@rvagg
Member

rvagg commented Aug 7, 2021

I'm not sure I can think of a case where I've had to manually type out the encoded form. Much more typically I'm using the integer form, e.g. https://github.com/ipld/js-dag-json/blob/328eda133620a0aff8cb19a314b5622f0054a6d3/index.js#L198, and I'm regularly copying and pasting from this table, so it'd be a massive loss to ditch the human-readable forms for computer-readable ones.

You might be able to make a case for tacking on an additional column with the bytes, but our CSV is not a very extensible format and unfortunately we haven't made any progress on a more extensible one.

@fabianhjr

Hey @Stebalien, I am currently working on some implementations and would appreciate some clarification about this issue.

For the example @keepsimple1 raised (0x81), is the procedure to take that value (129 in decimal) and then encode it as an unsigned varint (hence the 0x8101 given by @vmx)?

Are the entries then in big-endian notation? (So the udp multiaddr, which is listed as 0x0111, would be 273, and that would then be 0x9102 as an unsigned varint?)

@rvagg
Member

rvagg commented Jan 31, 2022

  1. Yes, if you encode 0x81 from the table as a varint, you get two bytes, 0x81 and 0x01, or 129 & 1. The multicodec table doesn't dictate the use of varints, which is why the value is listed as an integer there. Varints are an implementation detail of each use case. CIDs (only one use case of the multicodec table) do use varints, and we generally expect them to be used because they're compact and let us span a large range without the lower, more common values taking up a lot of space.
  2. Have a look at https://github.com/multiformats/unsigned-varint which attempts to describe what "varint" means in the multiformats context, with links to compatible implementations in JS, Rust and Go that you can compare against.
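
For comparing against those implementations, the reverse direction can be sketched in Python as below. This is an illustrative sketch, not code from any of the linked libraries: `decode_uvarint` and `MAX_UVARINT_LEN` are hypothetical names, and the 9-byte cap reflects my reading of the practical limit in the unsigned-varint spec, so check the spec before relying on it. The sketch also does not reject non-minimal encodings.

```python
from typing import Tuple

# Assumption: the unsigned-varint spec caps encodings at 9 bytes; verify against the spec.
MAX_UVARINT_LEN = 9


def decode_uvarint(data: bytes) -> Tuple[int, int]:
    """Decode one unsigned varint from the start of data.

    Returns (value, number_of_bytes_consumed). Illustrative sketch only.
    """
    value = 0
    for i, byte in enumerate(data):
        if i >= MAX_UVARINT_LEN:
            raise ValueError("varint is longer than the assumed 9-byte limit")
        value |= (byte & 0x7F) << (7 * i)  # each byte contributes 7 bits, low group first
        if not byte & 0x80:                # continuation bit clear: this was the last byte
            return value, i + 1
    raise ValueError("input ended before the varint was complete")


print(decode_uvarint(bytes.fromhex("8101")))  # (129, 2)
print(decode_uvarint(bytes.fromhex("9102")))  # (273, 2)
```

Decoding 0x9102 back to 273 (0x0111) matches the arithmetic in the udp question above: the table entry is simply the integer written in ordinary hexadecimal, and the byte order only matters once it is varint-encoded.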
