feat: add chunked data sync protocol #407

Closed
geoah opened this issue Jun 7, 2020 · 0 comments


geoah commented Jun 7, 2020

We need the ability for objects to contain large binary blobs that will not be fetched as part of the object itself, but rather have to be requested explicitly.

Data structure

Following the example of BitTorrent, IPFS, and other similar protocols, we can break the blobs into constant-sized (?) chunks and wrap each one in an object of a well-known type (i.e. chunk). The parent object then either (option 1) lists the chunk hashes as an array, or (option 2) references an intermediate object of another well-known type (i.e. blob) that holds the chunk hashes and presents a single hash for the whole blob.

Option 1. Array of chunks.

{
    "type:s": "nimona.io/chunk",
    "data:d": "<chunk-bytes>"
}
{
  "type:s": "foo",
  "data:o": {
    "displayPicture:as": [
      "<chunk-0-hash>",
      "<chunk-1-hash>",
      "<chunk-2-hash>"
    ]
  }
}

Option 2. Blob.

{
  "type:s": "nimona.io/chunk",
  "data:o": {
    "chunk:d": "<chunk-bytes>"
  }
}
{
  "type:s": "nimona.io/blob",
  "data:o": {
    "chunks:as": [
      "<chunk-0-hash>",
      "<chunk-1-hash>",
      "<chunk-2-hash>"
    ]
  }
}
{
  "type:s": "foo",
  "data:o": {
    "displayPicture:s": "<blob-hash>"
  }
}
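
To make option 2 concrete, here is a minimal sketch of splitting a payload into chunk objects and wrapping their hashes in a blob object. Several things are assumptions made only for illustration: the chunk size, the `object_hash` helper (SHA-256 over canonical JSON, which is not necessarily how nimona hashes objects), the base64 encoding of chunk bytes, and the function names `make_chunk` and `make_blob`.

```python
import base64
import hashlib
import json

CHUNK_SIZE = 256 * 1024  # hypothetical constant chunk size; the proposal leaves this open


def object_hash(obj: dict) -> str:
    """Placeholder hash: SHA-256 over canonical JSON (an assumption, not nimona's hashing)."""
    encoded = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def make_chunk(data: bytes) -> dict:
    # Bytes are base64-encoded here only so the object stays JSON-serializable.
    return {
        "type:s": "nimona.io/chunk",
        "data:o": {"chunk:d": base64.b64encode(data).decode("ascii")},
    }


def make_blob(payload: bytes, chunk_size: int = CHUNK_SIZE):
    """Split payload into chunks and return (blob_object, chunks_by_hash)."""
    chunks = {}
    order = []
    for offset in range(0, len(payload), chunk_size):
        chunk = make_chunk(payload[offset:offset + chunk_size])
        h = object_hash(chunk)
        chunks[h] = chunk  # identical chunks collapse to one stored object
        order.append(h)    # the blob still lists every position in order
    blob = {"type:s": "nimona.io/blob", "data:o": {"chunks:as": order}}
    return blob, chunks


# Usage: 2560 bytes split into 1024-byte chunks gives three chunks.
payload = bytes(i % 251 for i in range(2560))
blob, chunks = make_blob(payload, chunk_size=1024)

# The parent object only carries the single blob hash.
parent = {"type:s": "foo", "data:o": {"displayPicture:s": object_hash(blob)}}
```

With this layout the parent object stays the same size regardless of how many chunks the blob has, at the cost of one extra object to fetch before the chunk list is known.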

Protocol

Ideally we should try to merge the protocols for synchronizing streams with the protocol for fetching blobs and use the same primitives where possible.

The protocol can be split into two parts: discovery (1) and fetching (2).

Discovery

There is no real benefit to peers announcing individual chunk hashes to the DHT, so peers should only announce the parent object's hash. This way, a peer can look up other peers who have the root object and then query them about which chunks they have.
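
The discovery flow could look roughly like the sketch below. It uses an in-memory stand-in for the DHT, and every name in it (`ToyDHT`, `Peer`, `available_chunks`, `discover`) is hypothetical; in particular, the "which chunks do you have?" query is modeled as a plain method call rather than a real network message.

```python
from collections import defaultdict


class ToyDHT:
    """In-memory stand-in for the DHT: maps an object hash to the peers announcing it."""

    def __init__(self):
        self._providers = defaultdict(set)

    def announce(self, peer_id, object_hash):
        self._providers[object_hash].add(peer_id)

    def lookup(self, object_hash):
        return set(self._providers.get(object_hash, set()))


class Peer:
    def __init__(self, peer_id, dht):
        self.peer_id = peer_id
        self.dht = dht
        self.chunks = {}  # root hash -> set of chunk hashes held locally

    def store(self, root_hash, chunk_hashes):
        # Only the root hash is announced; chunk hashes stay off the DHT.
        self.chunks.setdefault(root_hash, set()).update(chunk_hashes)
        self.dht.announce(self.peer_id, root_hash)

    def available_chunks(self, root_hash):
        """Answer a (hypothetical) "which chunks do you have?" query."""
        return set(self.chunks.get(root_hash, set()))


def discover(dht, peers, root_hash):
    """Return {peer_id: chunk hashes} for every peer announcing root_hash."""
    return {pid: peers[pid].available_chunks(root_hash) for pid in dht.lookup(root_hash)}


# Usage: two peers hold overlapping parts of the same root object.
dht = ToyDHT()
peers = {name: Peer(name, dht) for name in ("alice", "bob")}
peers["alice"].store("root", {"c0", "c1"})
peers["bob"].store("root", {"c1", "c2"})
availability = discover(dht, peers, "root")
```

Here the DHT holds one entry per root object, while the per-chunk availability is learned directly from the peers that announced it.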

Fetching

Once a peer has learned who has which chunks, it can start requesting them. A chunk request should be the same as a request for any other object.
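
As a rough illustration of the fetching step, the sketch below assigns each wanted chunk to a peer known to hold it, requests it through a generic object request, and checks the result against the expected hash. The scheduling policy (least-loaded peer first), the `request_object` callback, and hashing raw bytes with SHA-256 are all assumptions for the example, not part of the proposal.

```python
import hashlib


def chunk_hash(data: bytes) -> str:
    # Placeholder: SHA-256 of the raw bytes (an assumption, not nimona's object hash).
    return hashlib.sha256(data).hexdigest()


def plan_requests(wanted, availability):
    """Assign each wanted chunk hash to a peer that has it, spreading the load."""
    load = {peer: 0 for peer in availability}
    plan = {}
    for h in wanted:
        candidates = [p for p, have in availability.items() if h in have]
        if not candidates:
            continue  # nobody is known to hold this chunk yet
        peer = min(candidates, key=lambda p: (load[p], p))
        plan[h] = peer
        load[peer] += 1
    return plan


def fetch_all(wanted, availability, request_object):
    """Fetch chunks via request_object(peer, hash) and verify each against its hash."""
    fetched = {}
    for h, peer in plan_requests(wanted, availability).items():
        data = request_object(peer, h)
        if data is not None and chunk_hash(data) == h:
            fetched[h] = data
    return fetched


# Usage: two peers with overlapping chunks, served from in-memory stores.
parts = [b"aaa", b"bbb", b"ccc"]
wanted = [chunk_hash(p) for p in parts]
stores = {
    "alice": {wanted[0]: parts[0], wanted[1]: parts[1]},
    "bob": {wanted[1]: parts[1], wanted[2]: parts[2]},
}
availability = {peer: set(store) for peer, store in stores.items()}
fetched = fetch_all(wanted, availability, lambda peer, h: stores[peer].get(h))
reassembled = b"".join(fetched[h] for h in wanted)
```

Because `request_object` takes only a peer and a hash, the same call could serve any other object, which is the property the proposal asks for.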

Previous works
