swarm/: Never call Behaviour::inject_event without a previous Behaviour::inject_connection_established #2337

mxinden · 2021-11-11T14:02:15Z

Imagine the following scenario:

The user calls Swarm::ban_peer_id for peer X.
The Swarm receives a NetworkEvent::IncomingConnection for a not-yet authenticated peer.
The Swarm calls this.network.accept to accept the connection and thus starts the upgrade process.
The Swarm receives a NetworkEvent::ConnectionEstablished from the network for the previously accepted connection. The connection appears to be to peer X which is marked as banned and thus the Swarm starts closing the connection. The Swarm does not call NetworkBehaviour::inject_connection_established.
The Swarm receives a NetworkEvent::ConnectionEvent from the network from the now closing connection to peer X. The Swarm calls NetworkBehaviour::inject_event.

The above leads to NetworkBehaviour::inject_event being called without a previous NetworkBehaviour::inject_connection_established. This violates the API contract between Swarm and NetworkBehaviour. E.g. libp2p-identify expects NetworkBehaviour::inject_event to never be called without a previous NetworkBehaviour::inject_connection_established.

rust-libp2p/protocols/identify/src/identify.rs

Lines 330 to 333 in 8b68026

    
                               .expect( 
        
                                   "`inject_event` is only called with an established connection \ 
        
                                        and `inject_connection_established` ensures there is an entry; qed", 
        
                               );

I see multiple ways moving forward, ordered by preference:

Track the connection state (Established and Closing) in EstablishedConnectionInfo (pool.rs) and ignore any task::EstablishedConnectionEvent::Notify from the connection in case it is in Closing state.
Check Swarm::banned_peers before forwarding an event from the network to the behaviour in Swarm (NetworkEvent::ConnectionEvent).
Drop the promise of Swarm to never call NetworkBehaviour::inject_event without a previous NetworkBehaviour::inject_connection_established and adjust libp2p-identify accordingly.

Credit goes to @divagant-martian for associating the libp2p-identify panic with the Swarm banning system.

//CC @AgeManning

What do folks think?

The text was updated successfully, but these errors were encountered:

AgeManning · 2021-11-12T01:03:45Z

My preference is not to do 3. I think it would be counter-intuitive and lead to more bugs when people build new behaviours.

I agree with your preferences :).

divagant-martian · 2021-11-16T19:40:43Z

Agree on crossing opt 3 out.
Once we receive an event, we check that the connection is present in the pool here I think the most straightforward and correct way to handle it is to do 2) and check here that the connection is (semantically) open/not closed

divagant-martian · 2021-11-16T19:59:43Z

Also, doing some tests I see that inject_connection_closed is called too for the banned peer, for which inject_connection_established was never called. I find this one a bit more complicated since the ban state can come and go, and for a currently connected peer, there can be connections for which inject_connection_established was called and some for which not. What do you think?

MarcoPolo · 2021-11-18T00:16:18Z

@divagant-martian Are you working on this? If not, do you mind if I pick this up? It sounds like we have a rough consensus on the solution here. But to be a bit clearer here's what I'm thinking:

Laws (not sure if this is currently true, but I'd endeavor to make them true):

Banning a peer will result in a disconnect from the peer if you are currently connected.
Banning has no effect on a currently connected peer, except for the disconnect mentioned above. This means if you ban an active peer, you will get inject_connection_closed for that peer when the connection is closed.
Once a peer is banned, a network behavior will never receive any events from that peer.

AgeManning · 2021-11-18T01:42:16Z

@MarcoPolo

This issue is rather time-sensitive for us, because the current master-libp2p branch panics when using identify. I think @divagant-martian has made progress (i'll let her chime in) but this is something we'd like to remedy quickly (I'm not sure about how quickly you think you would resolve this one).

MarcoPolo · 2021-11-18T01:44:24Z

That sounds good. This seems pretty important so I also wanted to make sure someone was working on it

mxinden · 2021-11-18T12:54:15Z

Capturing a conversation with @divagant-martian out-of-band.

Most favorable option at the moment.

When banning a peer:

Track the connections that have already been reported as open to the NetworkBehaviour for said peer.
Basically:

diff --git a/swarm/src/lib.rs b/swarm/src/lib.rs
index 0059a21a..ad4ee426 100644
--- a/swarm/src/lib.rs
+++ b/swarm/src/lib.rs
@@ -273,7 +273,7 @@ where
     external_addrs: Addresses,
 
     /// List of nodes for which we deny any incoming connection.
-    banned_peers: HashSet<PeerId>,
+    banned_peers: HashMap<PeerId, HashSet<ConnectionId>>,

Close the connections.
When receiving an event from a banned peer, drop it.
When receiving a connection close event for a connection of a banned peer, report it to the NetworkBehaviour if and only if it is tracked as one of the previously reported as open connections. Remove the connection from Swarm::banned_peers.

This should uphold all laws listed above by @MarcoPolo.

@divagant-martian will work on this 🙏. Please keep us posted on the progress here @divagant-martian.

divagant-martian · 2021-11-18T18:13:30Z

Since closing a connection is not immediate, tracking allowed connections for banned peers is still open to at least one other race condition that I can identify:

Peer is not banned, it's connected with connection 1. This connection was reported.
Peer is banned. It's added to banned peers and it's connection 1 is stored as previously allowed. start_close is called for 1.
A connection for the peer is received, and not reported, since it's banned. Let's call this one 2
Peer is un-banned. The entry for the peer is removed. Here we lost the information about 1 being allowed and 2 not being allowed.
Both connections finish their closing. The swarm is notified about this. We can now no longer decide correctly if the connection was allowed or not.

When suggesting to keep track of allowed connections, I would suggest doing it on a per-connection basis, instead on a per peer one. Tracking this can be done in a synchronous manner and it targets directly the root of the issue.

Thoughts?

MarcoPolo · 2021-11-18T18:50:17Z

I think that makes sense. To extend the laws above by one more constraint:

unbanning has no effect on currently active connections. Just like banning has no effect on currently active connects, except for the start of the disconnect.

Tracking this per connection is probably the simplest way to implement these laws correctly. Abstractly:

If a peer is banned, all active connections are still in the "unbanned state", but the connections are made to disconnect.
If a peer is banned, all new connections will be in the "banned state" and thus not emit any events to the network behavior.
If a peer is unbanned (and previously banned), all active connections are still in the "banned state".
If a peer is unbanned, all new connections will be in the "unbanned state".

Does that align with your thoughts?

divagant-martian · 2021-11-18T19:04:20Z

Yeah, I think we are on the same page. Seems straightforward to implement. I'll send a PR in a couple of hours

mxinden added bug difficulty:moderate help wanted labels Nov 11, 2021

mxinden mentioned this issue Nov 15, 2021

*: Prepare v0.41.0 #2342

Merged

divagant-martian mentioned this issue Nov 18, 2021

swarm/: Patch reporting on banned peer connections #2350

Merged

mxinden closed this as completed in #2350 Nov 26, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

swarm/: Never call Behaviour::inject_event without a previous Behaviour::inject_connection_established #2337

swarm/: Never call Behaviour::inject_event without a previous Behaviour::inject_connection_established #2337

mxinden commented Nov 11, 2021

AgeManning commented Nov 12, 2021

divagant-martian commented Nov 16, 2021

divagant-martian commented Nov 16, 2021 •

edited

Loading

MarcoPolo commented Nov 18, 2021

AgeManning commented Nov 18, 2021

MarcoPolo commented Nov 18, 2021

mxinden commented Nov 18, 2021

divagant-martian commented Nov 18, 2021

MarcoPolo commented Nov 18, 2021

divagant-martian commented Nov 18, 2021

swarm/: Never call Behaviour::inject_event without a previous Behaviour::inject_connection_established #2337

swarm/: Never call Behaviour::inject_event without a previous Behaviour::inject_connection_established #2337

Comments

mxinden commented Nov 11, 2021

AgeManning commented Nov 12, 2021

divagant-martian commented Nov 16, 2021

divagant-martian commented Nov 16, 2021 • edited Loading

MarcoPolo commented Nov 18, 2021

AgeManning commented Nov 18, 2021

MarcoPolo commented Nov 18, 2021

mxinden commented Nov 18, 2021

divagant-martian commented Nov 18, 2021

MarcoPolo commented Nov 18, 2021

divagant-martian commented Nov 18, 2021

divagant-martian commented Nov 16, 2021 •

edited

Loading