Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(swarm): expose duration of a connection #3135

Closed
mxinden opened this issue Nov 17, 2022 · 8 comments · Fixed by #3927
Closed

feat(swarm): expose duration of a connection #3135

mxinden opened this issue Nov 17, 2022 · 8 comments · Fixed by #3927
Labels
difficulty:easy getting-started Issues that can be tackled if you don't know the internals of libp2p very well help wanted priority:nicetohave

Comments

@mxinden
Copy link
Member

mxinden commented Nov 17, 2022

Description

Expose the duration a connection was alive.

I would assume the easiest way to expose this is via SwarmEvent::ConnectionClosed.

/// A connection with the given peer has been closed,
/// possibly as a result of an error.
ConnectionClosed {
/// Identity of the peer that we have connected to.
peer_id: PeerId,
/// Endpoint of the connection that has been closed.
endpoint: ConnectedPoint,
/// Number of other remaining connections to this same peer.
num_established: u32,
/// Reason for the disconnection, if it was not a successful
/// active close.
cause: Option<ConnectionError<THandlerErr>>,
},

Motivation

Needed for monitoring purposes, e.g. to be exposed to Prometheus via libp2p-metrics.

Current Implementation

Are you planning to do it yourself in a pull request?

No

@mxinden mxinden added priority:nicetohave difficulty:easy help wanted getting-started Issues that can be tackled if you don't know the internals of libp2p very well labels Nov 17, 2022
@thomaseizinger
Copy link
Contributor

I've been wondering, would it make more sense to implement a NetworkBehaviour that captures the metrics that we are currently capturing via SwarmEvents?

I think that might be easier to integrate compared to hooking into their event loop.

@mxinden
Copy link
Member Author

mxinden commented Nov 18, 2022

Good thought @thomaseizinger! I think this would be significantly less intrusive than the current approach.

How would one capture events from other NetworkBehaviour implementations? Would one derive NetworkBehaviour for the set of NetworkBehaviour's and then wrap that derived NetworkBehaviour in a metrics NetworkBehaviour?

Something along the lines of:

    #[derive(NetworkBehaviour)]
    struct MyBehaviour {
        floodsub: Floodsub,
        mdns: Mdns,
    }
    let my_behaviour = MyBehaviour::default();
    let my_instrumented_behaviour = libp2p_metrics::Behaviour::new(my_behaviour);

I would guess that this needs some macro magic. Not to say that it is not worth exploring.

@thomaseizinger
Copy link
Contributor

Either a wrapper as you said or we add metrics to the individual behaviours.

@thomaseizinger
Copy link
Contributor

Something like the following could work:

pub struct RecordBehaviour<B, M> {
	inner: B,
	metrics: M
}

impl<B, M> NetworkBehaviour for RecordBehaviour<B, M> where M: Recorder<B::OutEvent> {
	
}

impl Deref for RecordBehaviour<B, M> { }
impl DerefMut for RecordBehaviour<B, M> { }

Instead of Deref we could also offer an inner_mut() function but that might be a bit verbose. It would be nice if the RecordBehaviour is in invisible layer on top.

@tcoratger
Copy link
Contributor

@thomaseizinger Could you provide me with a little more detail on how to implement it (the overall roadmap)? If I understand how to do it, I could support this issue :)

@mxinden mxinden changed the title swarm/: Expose duration of a connection feat(swarm): expose duration of a connection May 1, 2023
@mxinden
Copy link
Member Author

mxinden commented May 1, 2023

I think the above suggestion of a new abstraction requires some exploratory work, along the lines of "How can we make the integration of the current libp2p-metrics crate into ones application less verbose".

In regards to the concrete issue, namely to expose the lifetime (duration) of a connection, I would suggest the following:

  • Add a connection_id: ConnectionId, to SwarmEvent::ConnectionEstablished and SwarmEvent::ConnectionClosed. This is in-line with what we already do in FromSwarm::ConnectionEstablished and FromSwarm::ConnectionClosed.
  • In libp2p-metrics the swarm module, keep track of the establishment time of each connection, indexed by ConnectionId. When tracking a SwarmEvent::ConnectionClosed, calculate the duration of the connection.
  • Introduce a prometheus_client::Histogram in the libp2p-metrics swarm module and track the connection duration there.

What do you think @tcoratger? Is this of some help?

@thomaseizinger
Copy link
Contributor

@thomaseizinger Could you provide me with a little more detail on how to implement it (the overall roadmap)? If I understand how to do it, I could support this issue :)

As Max said, I think this still needs some fleshing out. We should likely not block this issue on it but proceed the way Max suggested above.

I'll open a separate issue to discuss that.

@thomaseizinger
Copy link
Contributor

I'll open a separate issue to discuss that.

Done: #3868

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
difficulty:easy getting-started Issues that can be tackled if you don't know the internals of libp2p very well help wanted priority:nicetohave
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants