feat: add new metadata protocol #2062

Merged
merged 6 commits into from
Oct 11, 2023

Contributor

@alrevuelta alrevuelta commented Sep 21, 2023

Description:

  • Add cluster-id flag to wakunode2.
  • Create WakuMetadata protocol. See feat(66/WAKU2-METADATA): add WakuMetadata Protocol vacp2p/rfc#619
  • Use WakuMetadata and cluster-id in the peer manager, triggering disconnections from peers that support a different cluster-id than ours. This is applied only if cluster-id!=0; since cluster-id=0 by default, this avoids breaking backwards compatibility.

github-actions bot commented Sep 21, 2023

You can find the image built from this PR at

quay.io/wakuorg/nwaku-pr:2062

Built from a6ad8ce

@alrevuelta alrevuelta force-pushed the poc-metadata-protocol branch 2 times, most recently from 11475b4 to 6914cde Compare September 28, 2023 15:01
@github-actions

This PR may contain changes to configuration options of one of the apps.

If you are introducing a breaking change (i.e. the set of options in latest release would no longer be applicable) make sure the original option is preserved with a deprecation note for 2 following releases before it is actually removed.

Please also make sure the label release-notes is added to make sure any changes to the user interface are properly announced in changelog and release notes.

Contributor

@SionoiS SionoiS left a comment

Looks good!

I would be curious to know how much longer a full connection cycle takes if we have to wait for an exchange of metadata.

We could wait 2 Waku releases before disconnecting from peers that don't support the protocol.
I feel like it would smooth the transition.

As for the live updating of supported shards, let's just do the same as with discv5.

Member

@richard-ramos richard-ramos left a comment

Should this implementation include some kind of backoff mechanism to avoid reconnecting to the same peer frequently? (which is something that could happen with discv5 as it frequently returns the same peers)

@alrevuelta
Contributor Author

> I would be curious to know how much longer a full connection cycle takes if we have to wait for an exchange of metadata.

I'd guess 1 RTT. For relay that's not something that matters imho, since it doesn't block anything. But for req/resp protocols like store or filter, it will indeed add a small delay, equal to 1 RTT I guess.

> We could wait 2 Waku releases before disconnecting from peers that don't support the protocol.
> I feel like it would smooth the transition.

Now that I think about it, another option would be to only use this protocol and force a disconnection if clusterId=1 (aka The Waku Network). Or well, if clusterId!=0. And use clusterId=0 as default. With this:

  • old nodes can connect to new nodes that haven't changed the deployment (clusterId is not set and defaults to 0), so no disconnection happens.
  • new nodes explicitly deployed in TWN clusterId=1 will trigger the disconnection, so they won't connect to i) old nodes or ii) new nodes configured in a different cluster.

Then in future releases we can remove this check to enforce it always. wdyt @SionoiS ?

> As for the live updating of supported shards, let's just do the same as with discv5.

Sure, leaving it for another PR. Tracker: #2105

> Should this implementation include some kind of backoff mechanism to avoid reconnecting to the same peer frequently? (which is something that could happen with discv5 as it frequently returns the same peers)

imho it's not needed, since cluster-ids are also filtered in the discv5 layer. I mean, you won't try to connect to a peer in a different cluster, since we filter out that peer based on the ENR. This protocol is especially useful for inbound connections, since you don't have any control over those, and most likely you don't have the ENR to know the cluster.
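
To illustrate the point about outbound dials, here is a minimal sketch of filtering discovered peers by the cluster advertised in their ENR. The `DiscoveredPeer` type, its fields, and `dialCandidates` are hypothetical stand-ins, not nwaku's discv5 API, and dropping peers whose ENR carries no cluster information is an assumption of this sketch.

```nim
import std/[options, sequtils]

type
  DiscoveredPeer = object          # hypothetical stand-in for a discv5 result
    peerId: string
    enrClusterId: Option[uint32]   # none() when the ENR carries no cluster info

proc dialCandidates(peers: seq[DiscoveredPeer],
                    myClusterId: uint32): seq[DiscoveredPeer] =
  ## Keep only peers whose ENR advertises our own cluster.
  ## Assumption: peers without cluster info are skipped for outbound dials.
  peers.filterIt(it.enrClusterId == some(myClusterId))

when isMainModule:
  let discovered = @[
    DiscoveredPeer(peerId: "a", enrClusterId: some(1'u32)),
    DiscoveredPeer(peerId: "b", enrClusterId: some(2'u32)),
    DiscoveredPeer(peerId: "c", enrClusterId: none(uint32))
  ]
  for p in dialCandidates(discovered, 1):
    echo "would dial ", p.peerId
```

Inbound connections never pass through a filter like this, which is why the metadata exchange is the place to check them.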

Contributor

SionoiS commented Oct 4, 2023

> I'd guess 1 RTT. For relay that's not something that matters imho, since it doesn't block anything. But for req/resp protocols like store or filter, it will indeed add a small delay, equal to 1 RTT I guess.

How many round trips happen currently? Do you know? What's the % increase?

> Now that I think about it, another option would be to only use this protocol and force a disconnection if clusterId=1 (aka The Waku Network). Or well, if clusterId!=0. And use clusterId=0 as default. With this:
>
>   • old nodes can connect to new nodes that haven't changed the deployment (clusterId is not set and defaults to 0), so no disconnection happens.
>   • new nodes explicitly deployed in TWN clusterId=1 will trigger the disconnection, so they won't connect to i) old nodes or ii) new nodes configured in a different cluster.
>
> Then in future releases we can remove this check to enforce it always. wdyt @SionoiS ?

Good idea! Do you want to change the RFC to include cluster id 0 == default?

@alrevuelta alrevuelta marked this pull request as ready for review October 5, 2023 11:17
@alrevuelta
Contributor Author

PR is ready for review. As discussed, disconnections are only triggered if clusterId!=0 AND myClusterId!=remoteClusterId. This avoids breaking compatibility between old and new nodes. clusterId=0 is used as the default, so it acts as a "feature flag" for now.
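
The rule stated above can be sketched as a small pure function. This is not the code merged in the PR: `shouldDisconnect` and `DefaultClusterId` are hypothetical names, and treating a peer that never answers the metadata request as a mismatch (only when the local node has opted in) is an assumption based on the earlier proposal in this thread.

```nim
import std/options

const DefaultClusterId = 0'u32   # assumption: 0 means "cluster not configured"

proc shouldDisconnect(myClusterId: uint32,
                      remoteClusterId: Option[uint32]): bool =
  ## `remoteClusterId` is none() when the peer did not answer the
  ## metadata request (for example an old node without the protocol).
  if myClusterId == DefaultClusterId:
    return false   # feature flag off: never disconnect
  if remoteClusterId.isNone:
    return true    # opted in, but the peer cannot tell us its cluster
  myClusterId != remoteClusterId.get

when isMainModule:
  doAssert not shouldDisconnect(0, some(5'u32))   # default node stays connected
  doAssert not shouldDisconnect(1, some(1'u32))   # same cluster
  doAssert shouldDisconnect(1, some(2'u32))       # different cluster
  doAssert shouldDisconnect(1, none(uint32))      # peer without the protocol
```

Keeping the decision in one side-effect-free function makes the backwards-compatibility cases easy to unit test separately from the dialing logic.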

Contributor

@jm-clius jm-clius left a comment

LGTM! One question re housekeeping, but the approach makes sense to me.

try:
  conn = await pm.switch.dial(peerId, WakuMetadataCodec)
except CatchableError:
  info "disconnecting from peer", peerId=peerId, reason="waku metadata codec not supported"
  asyncSpawn(pm.switch.disconnect(peerId))
Contributor

Without having looked into the details it's not immediately clear to me if this disconnect would also trigger a proper cleanup of the peer in the peer store/manager so that we don't keep stale peers?

Contributor Author

indeed, good catch. removing the peer here:
dbd692a
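
The housekeeping concern raised in this thread can be illustrated with a toy model. The contents of commit dbd692a are not shown here, so this is only a sketch of the idea: `PeerManagerSketch` and `dropPeer` are hypothetical and use plain string sets instead of nwaku's peer store and libp2p switch.

```nim
import std/sets

type
  PeerManagerSketch = object       # hypothetical, not nwaku's PeerManager
    connected: HashSet[string]     # peers with an open connection
    peerStore: HashSet[string]     # peers remembered for future dials

proc dropPeer(pm: var PeerManagerSketch, peerId: string) =
  ## Close the connection and forget the peer in one step, so a rejected
  ## peer does not linger as a stale peer store entry.
  pm.connected.excl(peerId)
  pm.peerStore.excl(peerId)

when isMainModule:
  var pm = PeerManagerSketch(
    connected: toHashSet(["a", "b"]),
    peerStore: toHashSet(["a", "b", "c"]))
  pm.dropPeer("a")
  doAssert "a" notin pm.connected
  doAssert "a" notin pm.peerStore
```

Disconnecting without the second `excl` would leave "a" in the store, which is the stale-peer situation the reviewer asked about.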

Contributor

@SionoiS SionoiS left a comment

Looks good overall but I feel like we should keep in mind the apps that will not use TWN or use their own discovery process.

This metadata protocol should not be a requirement, just like relay is not required.

Also various nitpicks.

Contributor

@SionoiS SionoiS left a comment

👍

Thanks.

I'm still not sure about the use case where Waku is used without TWN. I'll think about it.

@alrevuelta alrevuelta merged commit d5c3ade into master Oct 11, 2023
9 of 10 checks passed
@alrevuelta alrevuelta deleted the poc-metadata-protocol branch October 11, 2023 06:58
@fryorcraken fryorcraken added the E:1.4: Sharded peer management and discovery See https://github.com/waku-org/pm/issues/67 for details label Oct 27, 2023
yakimant added a commit to status-im/infra-role-nim-waku that referenced this pull request Nov 7, 2023