Offline Message Queue #2

Open
Stebalien opened this issue Sep 20, 2018 · 18 comments

Comments

@Stebalien
Member

There's an open problem we'd like to resolve:

  1. User A generates some content and adds it to IPFS. They want to deliver this content to user B.
  2. User A tries to connect to user B. However, it turns out user B is offline.
  3. User A goes offline.

Currently, we have no good way to deliver this information to user B. Textile uses the DHT as follows:

  1. User A fails to connect to user B.
  2. User A puts PeerIdOf(UserB) -> CidOf(content) to the DHT (treating the DHT as a sloppy hash table).
  3. When user B comes online, they find themselves in the DHT and download their messages.
  4. Finally, they send an ACK back to user A (using the same mechanism).

Unfortunately:

  1. This relies heavily on potentially ephemeral nodes.
  2. There's no way to remove these messages from the DHT until they expire (see: Delete messages go-libp2p-kad-dht#196).
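
A minimal sketch of the DHT-as-mailbox approach described above, using a toy in-memory stand-in rather than a real DHT. The class name `SloppyDHT`, the TTL value, and the string peer IDs and CIDs are all illustrative assumptions. It shows why the missing delete matters: messages that were already read stay visible until they expire.

```python
import time
from collections import defaultdict


class SloppyDHT:
    """Toy in-memory stand-in for a DHT that keeps several values per key."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.records = defaultdict(list)  # key -> [(value, expires_at)]

    def put(self, key, value, now=None):
        now = time.time() if now is None else now
        self.records[key].append((value, now + self.ttl))

    def get(self, key, now=None):
        now = time.time() if now is None else now
        # There is no delete: entries only disappear once they expire.
        live = [(v, exp) for v, exp in self.records[key] if exp > now]
        self.records[key] = live
        return [v for v, _ in live]


dht = SloppyDHT(ttl_seconds=3600)
dht.put("peer-B", "cid-of-message-1", now=0)   # user A fails to reach user B
dht.put("peer-B", "cid-of-message-2", now=10)

inbox = dht.get("peer-B", now=20)            # user B comes online
still_there = dht.get("peer-B", now=30)      # already-read messages remain
after_expiry = dht.get("peer-B", now=3700)   # gone only after the TTL
```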
@Stebalien
Member Author

One alternative is to allow users to choose some always online node as their message queue. Users could either run these themselves or pay someone else to do so. In practice, we (and others) can probably just run this service for free.

Design

Parties

  • Sender - The node sending messages to the receiver.
  • Queue - A node queuing messages for the receiver. This queue may also double as a pinning service.
  • Receiver - The node receiving messages while offline.

Setup

The receiver picks a set of queue nodes and creates and signs a record specifying which queues should be used and, for each queue:

  • during which period the queue is "valid".
  • reliability (should the sender expect an ACK or just assume that the message will go through).
  • preference (which queue should be used first).
  • replication (should multiple queues be used?).

We'll have to balance flexibility with simplicity when designing this policy language.

Finally, the receiver gives this record to its selected queue nodes. These queue nodes are responsible for repeatedly putting this record into the DHT.
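
One possible shape for such a record, as a sketch only. The field names, the `select_queues` helper, and the choice to keep `replication` at the record level are assumptions made for illustration; the policy language is still undecided, and signing is left as a placeholder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueuePolicy:
    queue_id: str      # peer ID of the queue node
    valid_from: int    # start of the validity period (Unix seconds)
    valid_until: int   # end of the validity period (Unix seconds)
    expect_ack: bool   # reliability: should the sender wait for an ACK?
    preference: int    # lower value = try this queue first


@dataclass(frozen=True)
class QueueRecord:
    receiver_id: str
    replication: int        # how many queues the sender should deliver to
    queues: tuple           # tuple of QueuePolicy entries
    signature: bytes = b""  # placeholder; real signing is out of scope here


def select_queues(record, now):
    """Return the queues a sender should use at time `now`."""
    valid = [q for q in record.queues if q.valid_from <= now < q.valid_until]
    valid.sort(key=lambda q: q.preference)
    return valid[: record.replication]


record = QueueRecord(
    receiver_id="peer-B",
    replication=2,
    queues=(
        QueuePolicy("queue-1", 0, 100, True, 2),
        QueuePolicy("queue-2", 0, 100, False, 1),
        QueuePolicy("queue-3", 200, 300, True, 0),  # not yet valid at now=50
    ),
)
chosen = [q.queue_id for q in select_queues(record, now=50)]
```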

Protocol

  1. The sender adds their content to IPFS (well, likely IPLD).
  2. The sender convinces some service to store the content while they're offline (e.g., using a pinning service).
  3. The sender looks up the receiver's "queue" record in the DHT.
  4. The sender connects to the receiver's queue nodes according to the specified policy, sending the CID of the content to each one.
  5. When the receiver comes online, it checks with its chosen queues for any queued messages, draining the queues in the process.
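
Steps 3 to 5 can be simulated in memory as follows. This is a sketch under simplifying assumptions: a plain dict stands in for the DHT lookup, the `QueueNode` class and helper names are invented for illustration, and pinning (step 2), networking, and authentication are omitted.

```python
from collections import defaultdict, deque


class QueueNode:
    """Always-online node holding CIDs for offline receivers."""

    def __init__(self):
        self.pending = defaultdict(deque)  # receiver ID -> queued CIDs

    def enqueue(self, receiver_id, cid):
        self.pending[receiver_id].append(cid)

    def drain(self, receiver_id):
        # Hand over everything queued for the receiver and forget it.
        return list(self.pending.pop(receiver_id, ()))


def send(dht, queues, receiver_id, cid):
    """Steps 3-4: look up the receiver's queue record, then notify each queue."""
    for queue_id in dht[receiver_id]:
        queues[queue_id].enqueue(receiver_id, cid)


def receive(dht, queues, receiver_id):
    """Step 5: collect and de-duplicate CIDs from every chosen queue."""
    seen = []
    for queue_id in dht[receiver_id]:
        for cid in queues[queue_id].drain(receiver_id):
            if cid not in seen:
                seen.append(cid)
    return seen


queues = {"queue-1": QueueNode(), "queue-2": QueueNode()}
dht = {"peer-B": ["queue-1", "queue-2"]}  # stand-in for the queue record lookup

send(dht, queues, "peer-B", "cid-1")
send(dht, queues, "peer-B", "cid-2")
first = receive(dht, queues, "peer-B")   # receiver comes online
second = receive(dht, queues, "peer-B")  # queues were drained
```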

Drawbacks

The chief drawback with this protocol is that it's not truly peer-to-peer. That is, there are some nodes that must be willing to act as queues and users must be willing to rely on them (possibly paying them).

However, the alternative is to expect random nodes in the network to perform this service (like we do with the DHT). This is fine for ephemeral information that can be frequently refreshed (e.g., the queue records) but less useful for potentially long-lived messages.

@sanderpick

sanderpick commented Sep 27, 2018

Thanks for jumping in here @Stebalien. I've been batting around a pretty similar interim setup to your proposal. The key differences being:

  • The Queue node is just our Textile node, but running in "cafe" mode (as a daemon on a server, with an API). The actual queue would live locally, in the SQLite index
  • A client node signals the location of its chosen queue to the network by inserting a property into its IPNS based profile

Of course, it would be amazing to have this functionality in core. Some questions regarding your proposal:

  • Does the Queue node still insert a pointer to the content node into the DHT? Does this mean we'd still need to implement DHT delete?
  • If yes, couldn't the Q just keep a local index of messages (recipient id -> message address) in the repo? Doesn't seem like you'd need to publish pointers at all.

@Stebalien
Member Author

Does the Queue node still insert a pointer to the content node into the DHT? Does this mean we'd still need to implement DHT delete?

No. The queue node would just tell everyone that it's willing to store messages for user X. The client should already know which queues it has chosen.

A client node signals the location of its chosen queue to the network by inserting a property into its IPNS based profile

The only difference in this proposal is that it makes the queue node responsible for keeping this record alive. The problem with IPNS is that we make IPNS records expire to prevent replay. Ideally, in this case, the user would insert this information into some IPNS-like record and then distribute this record to all of its queues. The queues would then republish this to the DHT as necessary (to deal with DHT churn).

@sanderpick

Ah, makes sense!

Authentication w/ a queue seems straightforward enough, but what about authorizing a node to be able to start a queue on another node? Could this just be a config setting that nodes opt into? i.e.,

"MessageQueue": {
  "Enabled": true
},

That is to say, if a node has this enabled, any other node can lean on it for this service. Perhaps a client + queue size limit would be helpful, esp. considering the potential for spam.
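
A sketch of how the opt-in flag could combine with client and queue size limits. The class `LimitedQueueService` and its parameter names are hypothetical and do not correspond to any existing go-ipfs config option.

```python
from collections import deque


class LimitedQueueService:
    def __init__(self, enabled, max_clients, max_messages_per_client):
        self.enabled = enabled
        self.max_clients = max_clients
        self.max_messages = max_messages_per_client
        self.inboxes = {}  # client ID -> deque of CIDs

    def register(self, client_id):
        """Open an inbox for a client; refuse if disabled or full."""
        if not self.enabled:
            return False
        if client_id in self.inboxes:
            return True
        if len(self.inboxes) >= self.max_clients:
            return False
        self.inboxes[client_id] = deque()
        return True

    def enqueue(self, client_id, cid):
        """Queue a CID; refuse unknown clients and full inboxes."""
        inbox = self.inboxes.get(client_id)
        if inbox is None or len(inbox) >= self.max_messages:
            return False
        inbox.append(cid)
        return True


service = LimitedQueueService(enabled=True, max_clients=1, max_messages_per_client=2)
results = [
    service.register("peer-A"),          # accepted
    service.register("peer-C"),          # rejected: client limit reached
    service.enqueue("peer-A", "cid-1"),  # accepted
    service.enqueue("peer-A", "cid-2"),  # accepted
    service.enqueue("peer-A", "cid-3"),  # rejected: inbox full
    service.enqueue("peer-C", "cid-4"),  # rejected: no inbox
]
```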

@Stebalien
Member Author

Personally, I wouldn't even bundle this with go-ipfs. Instead, I'd build a new libp2p service providing a queue-server (it's pretty easy to write new libp2p services at this point). Unlike nodes in the DHT, these queue servers would have to be pretty reliable for this system to work.

@sanderpick

Sounds good. Couple questions, though I may be conflating solutions...

...the user would insert this information into some IPNS-like record and then distribute this record to all of its queues.

What does IPNS-like mean here? I'm likely not familiar enough with the underlying mechanism to infer your meaning.

The queues would then republish this to the DHT as necessary (to deal with DHT churn).

So, the queue node essentially pins the record?

To see how this might play out, I can start by adding handling to our existing service layer (which needs upgrading to the newer, more idiomatic libp2p services).

Down the road, I can break it out into a standalone server, usable by others. Are libp2p hosts able to handle multiple services simultaneously? If not, I suppose we could run multiple since we'll still need the textile thread service, i.e., each of our app nodes would need to run:

  • libp2p message service
  • libp2p application logic specific service (in our case, the thread service)

@Stebalien
Member Author

What does IPNS-like mean here? I'm likely not familiar enough with the underlying mechanism to infer your meaning.

Well, I guess they could just use IPNS. It would just add a layer of indirection because IPNS records can only point to a single path (queues would have to announce both the IPNS records and provider records for the object they point to). However, that's probably the cleaner solution.

So, the queue node essentially pins the record?

Yes.

Are libp2p hosts able to handle multiple services simultaneously?

Yes. You can register as many services as you want.
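
To illustrate the idea of one host serving several protocols, here is a toy dispatcher. It is not the libp2p API: the `Host` class, its method names, and the protocol IDs are invented for this sketch, which only shows routing by protocol ID.

```python
class Host:
    """Toy host that routes incoming streams by protocol ID."""

    def __init__(self):
        self.handlers = {}

    def set_stream_handler(self, protocol_id, handler):
        self.handlers[protocol_id] = handler

    def handle(self, protocol_id, payload):
        handler = self.handlers.get(protocol_id)
        if handler is None:
            raise KeyError(f"unsupported protocol: {protocol_id}")
        return handler(payload)


host = Host()
# Two independent services registered on the same host.
host.set_stream_handler("/example/inbox/1.0.0", lambda p: f"queued {p}")
host.set_stream_handler("/example/thread/1.0.0", lambda p: f"applied {p}")

inbox_reply = host.handle("/example/inbox/1.0.0", "cid-1")
thread_reply = host.handle("/example/thread/1.0.0", "update-7")
```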

@sanderpick

sanderpick commented Oct 28, 2018

I made some progress here. The textile-go lib now has a libp2p service for handling inboxing. The basic idea is...

  • peer A elects peer B, a (hopefully) always-online node, to handle its offline messages. It registers an inbox w/ peer B by completing a signing challenge. peer B then signs an expiring token (JWT for now) that peer A can use to check its messages. (Note: peer B doesn't have to participate)
  • Anytime a peer cannot directly connect to peer A to send an (application-specific) message, they can instead encrypt the message and leave it with peer B (in practice its IPFS address), who first looks to see if they have an inbox for peer A
  • When peer A comes back online, it calls up peer B and downloads any new messages

other stuff:

  • peer A is in charge of message (address) redundancy. It can register multiple inboxes. Message senders can then send the message to each (we can cap how many they will deal with)
  • every peer has an inbox and an outbox queue, which are processed opportunistically
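
The expiring-token part of this flow can be sketched with the standard library alone. This is not the textile-go implementation and not a real JWT: an HMAC over a small JSON payload stands in for the signed token, the signing challenge is skipped, and the function names and claim keys are assumptions.

```python
import base64
import hashlib
import hmac
import json


def issue_token(secret, peer_id, now, ttl):
    """Return an opaque token binding `peer_id` to an expiry time."""
    payload = json.dumps({"sub": peer_id, "exp": now + ttl}, sort_keys=True).encode()
    tag = hmac.new(secret, payload, hashlib.sha256).digest()
    encode = lambda raw: base64.urlsafe_b64encode(raw).decode()
    return encode(payload) + "." + encode(tag)


def verify_token(secret, token, now):
    """Return the peer ID if the token is authentic and unexpired, else None."""
    try:
        payload_b64, tag_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        tag = base64.urlsafe_b64decode(tag_b64)
    except ValueError:  # wrong number of parts or invalid base64
        return None
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None
    claims = json.loads(payload)
    return claims["sub"] if now < claims["exp"] else None


secret = b"queue-node-secret"  # known only to peer B in this sketch
token = issue_token(secret, "peer-A", now=1000, ttl=60)
fresh = verify_token(secret, token, now=1030)
expired = verify_token(secret, token, now=1060)
```

Peer B would check the token before releasing messages, so an expired or forged token yields nothing.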

How another peer can determine peer A's inbox(es) is a bit baked into application logic at this point. There's a concept of a "thread" which is a hash tree of state updates, handled with another p2p service. Each update has a header message which contains the author's inbox(es) addresses. I don't currently have a way for a peer to advertise these addresses to the network as a whole, though we could add to the public IPNS profile. Will need something there in order to "look up" a newly discovered peer's inbox(es).

Components:

Outgoing messages to other peers (which may be direct or end up in a hosted inbox) are queued as well (I'm trying to achieve truly offline first UX). If direct delivery fails, the (encrypted) message ends up in the cafe outbox (different protocol).

We'll take this around the block for a bit first, but I'd like to spin out the inboxing piece to a standalone p2p service repo. Would be neat to have a standardized way of releasing / including p2p services (maybe there is one?). I put together a base service based on some of cpacia's work that may be useful to others. It takes a protocol and a handler and you're off to the races.

Of course, feedback much welcome... thanks for the help so far!

@thomas92911

thomas92911 commented Nov 9, 2018

Thanks for jumping in here @Stebalien.

Offline messages should really be implemented by the service application.
Writing to the DHT is not recommended; it is not meant to be used this way.

My initial thoughts:

  1. Messages are based on libp2p-pubsub.
  2. Each application uses a single topic.
  3. A message-router-service should save messages and respond to queries.
  4. As a first step, the message sender and receiver exchange pub-keys (say hello),
    and then each uses the other's pub-key to encrypt messages.
  5. The message receiver reads the to-addr field and determines whether to parse the message or
    forward it.
  6. timeprove uses TOC (Time On the Chain) rather than real time: it is the latest block hash (filecoin, btc, ...), so it can't be pre-made.

Message struct example:

key pair:  secp256k1
    pubkey = multibase58btc(multikey-secp256k1(secp256k1 pubkey []byte))
    addr = multibase58btc(multihash-sha256(multikey-secp256k1(secp256k1 pubkey []byte)))

message:
    topic: "IMessageV0S0"
    pubkey: pubkey
    sign: private-key-sign(message)
    message:
        type: 0~...
        to: addr
        route: "-"
        ctime: "123456678"
        timeprove: multiprove("000000000000002cc43169bd.......")
        body: multi-pubkey-encrypt("{...}")
        nonce: "54bacdef51a11416"

message size limit: 2000
message body raw data(before multi-codec):
    size limit: 1024
message type:
    hello
    jcard
    query-jcard
    message
    offline-message
    query-offline-message
    message-receipt
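
A sketch of the `addr` derivation from the layout above. The multihash prefix for sha2-256 (`0x12 0x20`) and the multibase prefix `z` for base58btc are standard; the `multikey-secp256k1` wrapping is omitted here because its exact encoding is not given in the proposal, and the all-zero key bytes are a placeholder, not a valid key.

```python
import hashlib

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58btc(data):
    """Encode bytes with the Bitcoin base58 alphabet."""
    n = int.from_bytes(data, "big")
    out = ""
    while n > 0:
        n, rem = divmod(n, 58)
        out = B58_ALPHABET[rem] + out
    # Each leading zero byte is written as a leading '1'.
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + out


def multibase58btc(data):
    return "z" + base58btc(data)  # 'z' is the multibase prefix for base58btc


def multihash_sha256(data):
    digest = hashlib.sha256(data).digest()
    return bytes([0x12, 0x20]) + digest  # 0x12 = sha2-256, 0x20 = digest length


def address(pubkey_bytes):
    """addr = multibase58btc(multihash-sha256(pubkey)), without the multikey step."""
    return multibase58btc(multihash_sha256(pubkey_bytes))


placeholder_pubkey = bytes([0x02]) + bytes(32)  # shaped like a compressed key
addr = address(placeholder_pubkey)
```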

@Stebalien
Member Author

Stebalien commented Nov 9, 2018

I really can't figure out what you're proposing, why, what exactly you're trying to solve, etc. Could you start off with some example applications and issues with the existing proposal?

@thomas92911

thomas92911 commented Nov 10, 2018

Sorry, I am talking about what I want to do and provide it to ipfs developers for reference.

P2p chat app, it is an application based on ipfs network

  1. P2P encryption
  2. Offline message
  3. Each client node is a service node

I don't know if I can make it clear.

(BTW, I am in China, I hope someone can really understand what I want to do...)

@tobowers

This is great work and textile is looking awesome. We've been thinking along these same lines, but I've been trying to go down the path of sender keeps the queue rather than receiver. I think it puts the incentives in the right place.

Something like:

  1. Sender negotiates with an online server to keep its outgoing messages
  2. Sender attempts to send message to Receiver (fails)
  3. Sender puts messages into online server which listens to a topic
  4. Receiver comes online and says "I'm here" and online server sends a CID to a queue of messages
  5. Receiver acks the messages.
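
The sender-side variant can be sketched the same way. The `OutboxServer` class is hypothetical, and for brevity it hands the receiver a list of CIDs directly instead of a single CID pointing to a queue of messages; topics and networking are left out.

```python
class OutboxServer:
    """Online server storing a sender's outgoing messages until acknowledged."""

    def __init__(self):
        self.outbox = {}  # receiver ID -> list of CIDs awaiting an ACK

    def store(self, receiver_id, cid):
        self.outbox.setdefault(receiver_id, []).append(cid)

    def on_receiver_online(self, receiver_id):
        # Messages stay stored until the receiver acknowledges them.
        return list(self.outbox.get(receiver_id, []))

    def ack(self, receiver_id, cids):
        remaining = [c for c in self.outbox.get(receiver_id, []) if c not in cids]
        if remaining:
            self.outbox[receiver_id] = remaining
        else:
            self.outbox.pop(receiver_id, None)


server = OutboxServer()
server.store("peer-B", "cid-1")  # direct delivery failed, so store instead
server.store("peer-B", "cid-2")

offered = server.on_receiver_online("peer-B")
server.ack("peer-B", ["cid-1"])
after_partial_ack = server.on_receiver_online("peer-B")
server.ack("peer-B", ["cid-2"])
after_full_ack = server.on_receiver_online("peer-B")
```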

@dirkmc

dirkmc commented May 17, 2019

@tobowers that sounds like it's along the lines of Internet Mail 2000

@tobowers

@tobowers that sounds like it's along the lines of Internet Mail 2000

Yeah a bit, but with these new architectures and talking always-on servers we have a new set of tech to rethink how things should be.

@dirkmc

dirkmc commented May 17, 2019

Yes I agree, now that most people have a device in their pocket that is essentially always on, the Internet Mail 2000 architecture looks viable to implement on a p2p basis

@SomajitDey

SomajitDey commented Aug 27, 2021

@Stebalien Regarding your original post, can't simply adding a tag field to IPNS records help? I have posted a detailed proposal at discuss.ipfs.io (tagged you there) and sketched out a mailing/offline-messaging application based on it. Here's the link for your perusal.

@Stebalien
Member Author

@SomajitDey that's equivalent to what Textile was doing and the same problems still apply. My goal was to design something that's more reliable/more robust.

@SomajitDey

SomajitDey commented Aug 30, 2021

Dear @Stebalien,
You are right. (Just an off-topic thought: at least such a tagged IPNS record would bring what Textile is doing within the purview of the Go-IPFS CLI, making our already excellent IPNS even more versatile and helping core users who don't know libp2p. It's a simple enhancement on top of the existing implementation of ipfs name, without disrupting anything that exists already or requiring too much new work, right? Maybe add it as an --enable-tag-experiment feature.)

Back on topic, in your 2nd post above you wrote:

However, the alternative is to expect random nodes in the network to perform this service (like we do with the DHT).

This would be so elegant. Your analogy with DHT inspired the following strategy. Kindly comment on that.

Nodes can opt to be dhtserver or dhtclient, the latter mainly for efficiency and bandwidth savings. Along similar lines, can't we have two types of IPNS-over-pubsub peers, viz.

  1. those who subscribe to every name-specific topic they discover around them. This is possible because pubsub peers keep track of which topics their directly connected peers are subscribed to.
  2. those who subscribe to user-given names only (as is the case in the existing implementation).

Just as it is expected that there would always be dhtservers in the network, we might always have some pubsub-peers of Type 1 above. In that case, when the sender publishes its IPNS record (tagged with the receiver's peerID) over pubsub, a Type 1 peer that it is directly connected to also subscribes to the corresponding name(tag)-specific topic, and continues to republish as usual when sender goes offline, until the records expire.

There, however, needs to be a mechanism for Type 1 peers to discover one another, get connected, and sync the topic set they subscribe to. Discovery might be achieved in ways similar to how IPNS-over-pubsub nodes discover each other - seeking providers of rendezvous files/DHT keys.
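
A toy model of the two peer types, to make the proposed behaviour concrete. Everything here is an illustrative assumption: the `PubsubPeer` class, the `relay_all` flag, and the topic string are invented, and discovery between Type 1 peers and record expiry are not modelled.

```python
class PubsubPeer:
    def __init__(self, name, relay_all=False):
        self.name = name
        self.relay_all = relay_all  # True = "Type 1", False = "Type 2"
        self.topics = set()
        self.neighbours = []
        self.cache = {}             # topic -> last record seen

    def connect(self, other):
        self.neighbours.append(other)
        other.neighbours.append(self)

    def disconnect(self, other):
        self.neighbours.remove(other)
        other.neighbours.remove(self)

    def subscribe(self, topic):
        self.topics.add(topic)
        # Type 1 neighbours join any topic they see a direct peer subscribe to.
        for peer in self.neighbours:
            if peer.relay_all and topic not in peer.topics:
                peer.subscribe(topic)

    def publish(self, topic, record):
        self.cache[topic] = record
        for peer in self.neighbours:
            if topic in peer.topics and peer.cache.get(topic) != record:
                peer.publish(topic, record)

    def republish(self):
        for topic, record in list(self.cache.items()):
            self.publish(topic, record)


sender = PubsubPeer("sender")
relay = PubsubPeer("relay", relay_all=True)  # Type 1
bystander = PubsubPeer("bystander")          # Type 2
receiver = PubsubPeer("receiver")
sender.connect(relay)
sender.connect(bystander)

topic = "name-topic/tagged-for-receiver"  # hypothetical topic name
sender.subscribe(topic)
sender.publish(topic, "record-v1")
sender.disconnect(relay)                  # sender goes offline
sender.disconnect(bystander)

receiver.connect(relay)                   # receiver comes online later
receiver.subscribe(topic)
relay.republish()
```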

Thanks for your time.


6 participants