This repository has been archived by the owner on Dec 6, 2022. It is now read-only.

UnixFSv1 -> v2 upgrade path #24

Closed
Stebalien opened this issue Apr 12, 2019 · 5 comments

Comments

@Stebalien

Stebalien commented Apr 12, 2019

We need to figure out how we're actually going to upgrade from UnixFSv1 to UnixFSv2 while maintaining backwards compatibility without hating ourselves forever.

Requirements: Don't break links.
Goal: Look back in 5 years without crying.


I can think of four ways to do this:

Application Layer

Handle it at the application layer. In this version, we (at the application layer) look at the IPLD format (yuck) and decide if we're dealing with UnixFSv1 (DagPB) or UnixFSv2 (everything else).
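
A minimal sketch of what that application-layer check could look like, assuming the decision is made purely from the codec embedded in a binary CID. The names `versionFor` and `UnixFSVersion` are hypothetical and exist only for illustration. The sketch relies on two facts: dag-pb is multicodec 0x70, and a CIDv0 is a bare sha2-256 multihash that implies dag-pb.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// codecDagPB is the registered multicodec code for dag-pb.
const codecDagPB = 0x70

// UnixFSVersion is a hypothetical label used only in this sketch.
type UnixFSVersion int

const (
	UnixFSv1 UnixFSVersion = 1
	UnixFSv2 UnixFSVersion = 2
)

// codecOf extracts the codec from a binary CID.
// CIDv0 is a bare sha2-256 multihash (34 bytes, starting 0x12 0x20), implicitly dag-pb.
// CIDv1 is <varint version = 1><varint codec><multihash>.
func codecOf(cid []byte) (uint64, error) {
	if len(cid) == 34 && cid[0] == 0x12 && cid[1] == 0x20 {
		return codecDagPB, nil
	}
	version, n := binary.Uvarint(cid)
	if n <= 0 || version != 1 {
		return 0, errors.New("unsupported or malformed CID")
	}
	codec, m := binary.Uvarint(cid[n:])
	if m <= 0 {
		return 0, errors.New("malformed CID codec")
	}
	return codec, nil
}

// versionFor applies the rule from the text: dag-pb means UnixFSv1,
// every other codec is assumed to be UnixFSv2.
func versionFor(cid []byte) (UnixFSVersion, error) {
	codec, err := codecOf(cid)
	if err != nil {
		return 0, err
	}
	if codec == codecDagPB {
		return UnixFSv1, nil
	}
	return UnixFSv2, nil
}

func main() {
	// A CIDv1 with codec 0x71 (dag-cbor) and an all-zero sha2-256 digest.
	cid := append([]byte{0x01, 0x71, 0x12, 0x20}, make([]byte, 32)...)
	v, err := versionFor(cid)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("UnixFSv%d\n", v)
}
```

Note that every caller which resolves a path would need this branch, which is part of why the option is unattractive.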

Foreign Filesystem

Treat UnixFSv1 as a "foreign filesystem" like in #23. IMO, this is the "most correct" way to handle this. However, we'd either need to pick a new prefix (not /ipfs) or do some fancy path conversion at the edges. Unfortunately, it's also going to be hard to make this opaque to the user.

Implicit Transform

Introduce an implicit transform. That is:

  1. Introduce a new IPLD format unixfsv1. The unixfsv1 IPLD codec would decode both the DagPB protobuf and the UnixFS protobuf at the same time and automatically transform them into valid UnixFSv2.
  2. When a user tries to access /ipfs/DagPBThing, rewrite that to /ipfs/UnixFSv1Thing.
  3. When a user links to a unixfsv1 thing from a v2 file/directory, rewrite the codec.
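
Steps 2 and 3 amount to swapping the codec in a CID while leaving the multihash alone, so the link still points at the same block. A rough sketch, assuming binary CIDs. The value `codecUnixFSv1` is a made-up placeholder (no such multicodec is registered), and `rewriteToUnixFSv1` and `splitCID` are hypothetical helper names.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const (
	// codecDagPB is the registered multicodec code for dag-pb.
	codecDagPB = 0x70
	// codecUnixFSv1 is a placeholder for this sketch only; it is NOT a registered code.
	codecUnixFSv1 = 0x300001
)

// splitCID returns the codec and multihash of a binary CID.
// A CIDv0 (bare sha2-256 multihash) is treated as dag-pb.
func splitCID(cid []byte) (uint64, []byte, error) {
	if len(cid) == 34 && cid[0] == 0x12 && cid[1] == 0x20 {
		return codecDagPB, cid, nil
	}
	version, n := binary.Uvarint(cid)
	if n <= 0 || version != 1 {
		return 0, nil, errors.New("unsupported or malformed CID")
	}
	codec, m := binary.Uvarint(cid[n:])
	if m <= 0 {
		return 0, nil, errors.New("malformed CID codec")
	}
	return codec, cid[n+m:], nil
}

// rewriteToUnixFSv1 re-tags a dag-pb CID with the placeholder unixfsv1 codec.
// The multihash is copied unchanged, so the rewritten CID names the same bytes.
// CIDs with any other codec are returned as-is.
func rewriteToUnixFSv1(cid []byte) ([]byte, error) {
	codec, mh, err := splitCID(cid)
	if err != nil {
		return nil, err
	}
	if codec != codecDagPB {
		return cid, nil
	}
	buf := make([]byte, 2*binary.MaxVarintLen64)
	n := binary.PutUvarint(buf, 1) // CID version 1
	n += binary.PutUvarint(buf[n:], codecUnixFSv1)
	return append(buf[:n], mh...), nil
}

func main() {
	cidV0 := append([]byte{0x12, 0x20}, make([]byte, 32)...)
	out, err := rewriteToUnixFSv1(cidV0)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	codec, _, _ := splitCID(out)
	fmt.Printf("codec after rewrite: 0x%x\n", codec)
}
```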

The only concern I have with this approach is sharded directories given that the new HAMT structure is slightly different from the current sharding system. However, I think we'll be able to interpret current sharded directories as valid but non-canonical go-ipld-hamt based directories (I hope).

Note: This is the first selector-friendly option. UnixFSv2 selectors should work on this dag (after applying the "transform" codec).

Revise History

My final option is to do the transform by redefining the dag-pb multicodec to unixfsv1. That is, literally just say "0x70" (the dag-pb codec) is now "unixfsv1" at the IPLD layer and auto-transform the double-protobuf into a nice unixfsv2 looking node.

The only concern here is existing users, as this will break every dag-pb user except unixfs. Fortunately, I believe the only users of this format are:

  1. UnixFSv1.
  2. Our GeoIP database, used in the webui, which we'd like to re-import with CBOR anyway.
  3. The pin datastructure which we can easily replace (and plan on replacing).

Given that, this is actually my favorite option. It means new unixfs implementations can push all the legacy logic down into a single IPLD codec.
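
To illustrate what "redefining" 0x70 would mean in practice, here is a toy codec table. This is only a sketch: `Registry`, `Node`, and both decoders are hypothetical stand-ins rather than any real IPLD API, and the decoder bodies are placeholders that do not parse protobuf. It also shows why non-unixfs dag-pb users would break: after the rebind, the raw Links/Data shape is no longer what comes out.

```go
package main

import "fmt"

// codecDagPB is the registered multicodec code for dag-pb.
const codecDagPB = 0x70

// Node is a stand-in for a decoded IPLD node (hypothetical).
type Node map[string]interface{}

// Decoder turns a raw block into a Node.
type Decoder func(block []byte) (Node, error)

// Registry maps multicodec codes to decoders (hypothetical).
type Registry map[uint64]Decoder

// decodeDagPB is a placeholder for today's behaviour: expose the raw
// dag-pb shape (Links plus an opaque Data field).
func decodeDagPB(block []byte) (Node, error) {
	return Node{"Links": []interface{}{}, "Data": block}, nil
}

// decodeUnixFSv1 is a placeholder for the proposed behaviour: decode both
// protobuf layers and return a v2-looking node. The field names are invented.
func decodeUnixFSv1(block []byte) (Node, error) {
	return Node{"kind": "file", "content": block}, nil
}

// Decode looks up the decoder registered for codec and runs it.
func (r Registry) Decode(codec uint64, block []byte) (Node, error) {
	dec, ok := r[codec]
	if !ok {
		return nil, fmt.Errorf("no decoder for codec 0x%x", codec)
	}
	return dec(block)
}

func main() {
	r := Registry{codecDagPB: decodeDagPB}
	before, _ := r.Decode(codecDagPB, []byte("block"))
	_, hasLinks := before["Links"]
	fmt.Println("before rebind, has Links:", hasLinks)

	// "Revise history": the same code now means unixfsv1.
	r[codecDagPB] = decodeUnixFSv1
	after, _ := r.Decode(codecDagPB, []byte("block"))
	_, hasLinks = after["Links"]
	fmt.Println("after rebind, has Links:", hasLinks)
}
```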

@warpfork

warpfork commented Apr 14, 2019

Nice writeup.

I also like the last two options. I agree with the reasoning: implementing the interfaces to unixfsv1 at the codec level should let us make selectors "just work" over them.

I should note I'm not a person who's the most super familiar with the operational ins and outs of the protoc compiler, so my estimations of complexity and effort requirements for working around that immediate area are probably dodgy. But one way or another, we should be able to crank the multicodec plugin table to make it work without cluttering the core go-ipld repo, so whatever needs to be done should be doable.

@mikeal
Contributor

mikeal commented Apr 15, 2019

Do /ipfs/ paths support traversal through arbitrary IPLD graphs or are they restricted to traversal within unixfs data structures?

@Stebalien
Author

@mikeal they're only supposed to be used with unixfs trees, but there are probably a bunch of edge cases in practice. For one, I don't believe js-ipfs supports the /ipld prefix.

> I should note I'm not a person who's the most super familiar with the operational ins and outs of the protoc compiler, so my estimations of complexity and effort requirements for working around that immediate area are probably dodgy.

Making protobuf nodes work with the streaming {en,de}coder would be tricky but we should be able to write a simple ahead-of-time to streaming adapter for formats like this.
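
For a sense of what such an adapter might look like: decode the whole protobuf block up front, then replay the in-memory result as a token stream. The `Token` type and `Stream` function below are hypothetical and are not the API of any existing streaming codec; this is a sketch under that assumption.

```go
package main

import (
	"fmt"
	"sort"
)

// TokenKind enumerates the token types of this hypothetical stream.
type TokenKind int

const (
	TokMapOpen TokenKind = iota
	TokMapClose
	TokListOpen
	TokListClose
	TokString
	TokBytes
	TokInt
)

// Token is one event in the stream.
type Token struct {
	Kind  TokenKind
	Str   string
	Bytes []byte
	Int   int64
}

// Stream walks a fully decoded value and replays it as tokens via emit.
// Map keys are sorted so the output order is deterministic.
func Stream(v interface{}, emit func(Token) error) error {
	switch x := v.(type) {
	case map[string]interface{}:
		if err := emit(Token{Kind: TokMapOpen}); err != nil {
			return err
		}
		keys := make([]string, 0, len(x))
		for k := range x {
			keys = append(keys, k)
		}
		sort.Strings(keys)
		for _, k := range keys {
			if err := emit(Token{Kind: TokString, Str: k}); err != nil {
				return err
			}
			if err := Stream(x[k], emit); err != nil {
				return err
			}
		}
		return emit(Token{Kind: TokMapClose})
	case []interface{}:
		if err := emit(Token{Kind: TokListOpen}); err != nil {
			return err
		}
		for _, item := range x {
			if err := Stream(item, emit); err != nil {
				return err
			}
		}
		return emit(Token{Kind: TokListClose})
	case string:
		return emit(Token{Kind: TokString, Str: x})
	case []byte:
		return emit(Token{Kind: TokBytes, Bytes: x})
	case int64:
		return emit(Token{Kind: TokInt, Int: x})
	default:
		return fmt.Errorf("unsupported value type %T", v)
	}
}

func main() {
	// Stand-in for a node that was decoded ahead of time from protobuf.
	node := map[string]interface{}{
		"Data":  []byte("hi"),
		"Links": []interface{}{map[string]interface{}{"Name": "a", "Tsize": int64(2)}},
	}
	count := 0
	err := Stream(node, func(Token) error { count++; return nil })
	fmt.Println("tokens emitted:", count, "error:", err)
}
```

The cost of this approach is that the whole node sits in memory before the first token is emitted, which is acceptable for small protobuf nodes but gives none of the benefits of true streaming.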

@mikeal
Contributor

mikeal commented Apr 15, 2019

In that case, @warpfork could we design a schema for unixfsv1 and apply it to any dag-pb node in the /ipfs/ path engine? This might give a solid set of requirements for the first version of schemas.

@rvagg
Member

rvagg commented Dec 6, 2022

closing for archival

@rvagg rvagg closed this as completed Dec 6, 2022