Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Report aborted connection from Pool::poll #2369

Closed
wants to merge 8 commits into from
Closed

Report aborted connection from Pool::poll #2369

wants to merge 8 commits into from

Conversation

jmmaloney4
Copy link
Contributor

First attempt at fixing #2329.

Copy link
Contributor

@MarcoPolo MarcoPolo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

Bonus points if you add a test case to this :)
Maybe something like:

  1. create new pool
  2. call pool.add_outgoing(...) with a transport (see examples)
  3. pool.disconnect
  4. pool.pool and see if you see a PoolEvent::PendingOutboundConnectionError
    (I haven't tried this, so it may not work)

@@ -50,7 +50,8 @@ use std::{
task::Context,
task::Poll,
};
use void::Void;

use self::task::PendingConnectionCommand;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, I would not use ... this directly and instead follow the same pattern as EstablishedConnectionCommand referencing it by the task namespace. E.g. task::PendingConnectionCommand

@jmmaloney4
Copy link
Contributor Author

I tried adding a test case, but I'm not sure how to call the Pool::poll method without an async context. What would be the right way to do that in a test case?

@MarcoPolo
Copy link
Contributor

Nice work @jmmaloney4 I filled in some of the missing pieces here: 0c61938.

I had to move the test to be within the module since the pool module is pub(crate) (so not usable from an external crate like those test).

Feel free to cherry pick that commit :)

@jmmaloney4
Copy link
Contributor Author

Thanks! I took your code and moved it into my branch, and I also tried to make the TestHandler implementations shared, but I'm not sure if I did that in an appropriate way--I had to make the util module pub here:

pub mod util;

Copy link
Contributor

@thomaseizinger thomaseizinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Left some comments inline.

Either::Left((Err(oneshot::Canceled), _)) => {
unreachable!("Pool never drops channel to task.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not quite true. It is dropped immediately after abort finishes because we are consuming self in there.

I've read through #2329 but I don't quite understand why we need to send a dedicated value now and the drop notifier isn't enough?

  1. We always initialize abort_receiver with Some so there is never a case where we call abort and wouldn't be able to send PendingConnectionCommand::Abort.
  2. We are still dropping the entire struct at the end of abort so all the data is gone anyway.
  3. We are not sending any information along with PendingConnectionCommand::Abort.

Unless I am missing something, reacting to Canceled should be equivalent to what we currently have here? The abort method can just consume self and do nothing with it which will immediately drop it and trigger the drop notifier. That will still be the same public API albeit it may be worth considering removing abort and using std::mem::drop instead if we want to be explicit about the dropping.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not quite true. It is dropped immediately after abort finishes because we are consuming self in there.

Maybe there's a misunderstanding. The thing that gets dropped is PendingConnection, the thing that has the abort_notifier is PendingConnectionInfo. The PendingConnectionInfo stays around in the hash map of the pool.

Whether you actually need to send abort vs just taking the notfier and dropping it. I think you can do the same thing by dropping the notifier. The reason you may want the abort value here is to make the distinction between this was dropped and we aborted this. Although with the unreachable we're saying we would never drop this (I'm not fully sure that's true, I need to look more closely).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not quite true. It is dropped immediately after abort finishes because we are consuming self in there.

Maybe there's a misunderstanding. The thing that gets dropped is PendingConnection, the thing that has the abort_notifier is PendingConnectionInfo. The PendingConnectionInfo stays around in the hash map of the pool.

Ah what a subtle difference, I missed that, thank you!

The reason you may want the abort value here is to make the distinction between this was dropped and we aborted this. Although with the unreachable we're saying we would never drop this (I'm not fully sure that's true, I need to look more closely).

That is what my closer look was triggered by! We are adding a way of distinguishing the two and then asserting that one of them never happens. So why add that in the first place?

It seems to be sufficient to change the oneshot::Sender<Void> to an Option<oneshot::Sender<Void>> so we can take and mem::drop it which will trigger the cancellation. That should avoid the unreachable within the select.



#[derive(Debug)]
pub struct TestHandler();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of making this handler public, can we move the test you added to tests/ and hence reuse the handler from there? I don't think we should be adding this one to the public about of libp2p-core.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To resolve the issue about Pool being pub(crate), I'd suggest to write the unit test against the Network component (which is the public API of libp2p-core).

The structure should roughly be the same:

  1. Make a new Network
  2. dial a peer
  3. Retrieve it with peer
  4. Convert it into a DialingPeer
  5. disconnect the DialingPeer
  6. Observe the aborted event

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh! good idea, thanks!


/// Commands that can be sent to a task driving an established connection.
#[derive(Debug)]
pub enum EstablishedConnectionCommand<T> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If my assumption above is correct, I'd vote for undoing this rename to reduce the noise of the diff.

@MarcoPolo
Copy link
Contributor

@jmmaloney4 added the suggestions from @thomaseizinger in my branch here: https://github.com/libp2p/rust-libp2p/compare/master...MarcoPolo:marco/fix-2329-v2?expand=1 (I can't push to this branch). Feel free to cherry pick those commits here.

@mxinden
Copy link
Member

mxinden commented Feb 1, 2022

Friendly ping @jmmaloney4.

@mxinden
Copy link
Member

mxinden commented Feb 14, 2022

There have been some larger changes (#2492). To make things easier on your end I resolved the conflicts. As far as I can there is nothing else missing, thus I am closing here with the goal of merging via #2517.

Please leave any comments on #2517 in case I am missing something.

Thanks @jmmaloney4 @MarcoPolo and @thomaseizinger for the work!

@mxinden mxinden closed this Feb 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants