Allow to broadcast network messages in parallel #1409
1e2dc02 to c900596
}

// Here we wait for all the delayed messages to be sent.
I suggest adding a metric that tells us how long we wait here. This will help us measure the level of back pressure.
I think the best metric type here would be a histogram; on the other hand, it will be quite expensive in terms of metrics space. WDYT about adding a histogram with a small number of buckets (like 0, 1, 5, 10)?
I'd say it would be sufficient to just have a counter that measures how much time is spent there.
But we still need a histogram for time, I suppose.
We could also add a metric for the size of the `delayed_messages` queue.
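To make the histogram-versus-counter trade-off in this thread concrete, here is a minimal, self-contained sketch of a wait-time histogram with only a few buckets. `WaitHistogram` is a hypothetical type, not something from the PR, and the bucket bounds are assumed to be in seconds (the comment above gives no unit). Buckets here are non-cumulative for readability; a real metrics library may count differently. The `sum_secs` field carries the same information a plain time counter would.

```rust
use std::time::Duration;

/// Hypothetical fixed-bucket histogram for the time spent waiting on
/// delayed messages. A real subsystem would more likely use its metrics
/// library than hand-roll this.
struct WaitHistogram {
    /// Upper bounds in seconds (assumed unit; values from the suggestion above).
    bounds: [f64; 4],
    /// One slot per bound plus a final overflow slot.
    counts: [u64; 5],
    /// Total observed wait, i.e. what a plain counter would report.
    sum_secs: f64,
}

impl WaitHistogram {
    fn new() -> Self {
        Self { bounds: [0.0, 1.0, 5.0, 10.0], counts: [0; 5], sum_secs: 0.0 }
    }

    /// Record one wait in the first bucket whose bound is not exceeded.
    fn observe(&mut self, waited: Duration) {
        let secs = waited.as_secs_f64();
        self.sum_secs += secs;
        let idx = self
            .bounds
            .iter()
            .position(|bound| secs <= *bound)
            .unwrap_or(self.bounds.len());
        self.counts[idx] += 1;
    }
}

fn main() {
    let mut histogram = WaitHistogram::new();
    for millis in [0u64, 300, 2_000, 7_000, 30_000] {
        histogram.observe(Duration::from_millis(millis));
    }
    println!(
        "bucket counts: {:?}, total wait: {:.1}s",
        histogram.counts, histogram.sum_secs
    );
}
```

With four bounds the histogram costs five counters plus a sum, which is the space concern raised above; the sum alone is the cheaper counter variant.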
How does it perform?
@@ -1038,21 +1041,58 @@ fn dispatch_collation_event_to_all_unbounded(
}
}

fn try_send_validation_event<E>(
I find the name a bit misleading. The `try` suggests that we don't wait when the channel is full, which is true, but we also don't give up. Maybe `send_or_queue_validation_event`?
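The semantics behind the proposed name can be sketched as follows: attempt a non-blocking send and, if the bounded channel is full, keep the event in a local queue rather than dropping it. This is only an illustration using the standard library's `sync_channel`; the PR itself works with overseer senders, and `send_or_queue` and `delayed` are hypothetical names.

```rust
use std::collections::VecDeque;
use std::sync::mpsc::{sync_channel, SyncSender, TrySendError};

/// Hypothetical helper: try a non-blocking send and, if the bounded channel
/// is full, park the event in `delayed` instead of waiting or giving up.
/// Returns `true` if the event was sent immediately.
fn send_or_queue<E>(event: E, sender: &SyncSender<E>, delayed: &mut VecDeque<E>) -> bool {
    match sender.try_send(event) {
        Ok(()) => true,
        Err(TrySendError::Full(event)) => {
            // Channel is full: remember the event so it can be sent later.
            delayed.push_back(event);
            false
        }
        // Receiver is gone: this sketch simply drops the event.
        Err(TrySendError::Disconnected(_)) => false,
    }
}

fn main() {
    // A bounded channel with room for exactly one message.
    let (tx, rx) = sync_channel::<u32>(1);
    let mut delayed = VecDeque::new();

    assert!(send_or_queue(1, &tx, &mut delayed)); // fits in the buffer
    assert!(!send_or_queue(2, &tx, &mut delayed)); // buffer full, queued locally

    assert_eq!(rx.recv().unwrap(), 1);
    assert_eq!(delayed, VecDeque::from([2]));
}
```

The caller is then responsible for draining `delayed` later, which is the point where the waiting (and hence the back pressure) would happen.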
@@ -1038,21 +1041,58 @@ fn dispatch_collation_event_to_all_unbounded(
}
}

fn try_send_validation_event<E>(
event: E,
sender: &mut (impl overseer::NetworkBridgeRxSenderTrait + overseer::SubsystemSender<E>),
nit: add the trait bound to the `where` clause
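For readers unfamiliar with the nit, the two spellings are sketched below. `BridgeRxSender` and `SubsystemSender` here are hypothetical stand-ins for `overseer::NetworkBridgeRxSenderTrait` and `overseer::SubsystemSender<E>`; the real traits are not reproduced, and the function bodies are placeholders.

```rust
// Hypothetical stand-ins for the overseer traits named in the diff.
trait BridgeRxSender {}
trait SubsystemSender<E> {
    fn send(&mut self, event: E);
}

// Bounds written inline in the parameter type, as in the diff.
fn send_inline<E>(event: E, sender: &mut (impl BridgeRxSender + SubsystemSender<E>)) {
    sender.send(event);
}

// The same bounds moved to a `where` clause on a named type parameter.
fn send_where<E, S>(event: E, sender: &mut S)
where
    S: BridgeRxSender + SubsystemSender<E>,
{
    sender.send(event);
}

// Minimal implementor used only to exercise both functions.
struct Recorder(Vec<u32>);
impl BridgeRxSender for Recorder {}
impl SubsystemSender<u32> for Recorder {
    fn send(&mut self, event: u32) {
        self.0.push(event);
    }
}

fn main() {
    let mut recorder = Recorder(Vec::new());
    send_inline(1u32, &mut recorder);
    send_where(2u32, &mut recorder);
    assert_eq!(recorder.0, vec![1, 2]);
}
```

Both forms accept the same arguments; the `where` form keeps the signature line short and gives the sender type a name that callers can specify explicitly.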
I have tested it on Versi and it performed quite well with a cluster of malus nodes. However, I have not compared its performance with the baseline due to a clash with A/B testing. With the metrics added, we can probably check those metrics and decide.
Ideally we would indeed know the impact this has, so yes, I would appreciate some A/B testing.
The futures channels that are used by default have a side effect: `Sender::Clone` effectively increases the capacity of the bounded channel by one. This PR fixes the undesired removal of backpressure caused by #1409. The issue was discovered by @sandreim during Versi testing and should be treated as critical: #1409 should not be included in any release without this reversion. This PR restores the original behaviour.
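The capacity effect follows from the rule documented for `futures::channel::mpsc::channel`, where the channel's capacity equals the buffer size plus the number of senders. The sketch below is a toy arithmetic model of that rule, not the futures implementation, and it assumes (for illustration only) a usage pattern in which one extra sender clone is alive per in-flight message; `effective_capacity` and the buffer size of 4 are hypothetical.

```rust
/// Toy model of the documented capacity rule for a bounded futures mpsc
/// channel: capacity = buffer + number of live senders.
/// This is arithmetic only, not the real channel implementation.
fn effective_capacity(buffer: usize, senders: usize) -> usize {
    buffer + senders
}

fn main() {
    let buffer = 4; // illustrative buffer size

    // A single sender: the bound stays close to the configured buffer.
    assert_eq!(effective_capacity(buffer, 1), 5);

    // Assumed pattern: one extra clone per in-flight message. The bound then
    // grows with the number of messages, so it never pushes back.
    for in_flight in [10usize, 100, 1000] {
        let senders = 1 + in_flight;
        println!(
            "{} in-flight clones -> effective capacity {}",
            in_flight,
            effective_capacity(buffer, senders)
        );
    }
}
```

Under this model the channel stops exerting backpressure as soon as the number of clones scales with the load, which is consistent with the behaviour described above.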
This PR addresses multiple issues pending: