Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Missing consensus telemetry events from FinalizationConsumer and CommunicatorConsumer #5020

Merged
merged 3 commits into from
Nov 15, 2023

Conversation

AlexHentschel
Copy link
Member

@AlexHentschel AlexHentschel commented Nov 15, 2023

Metrika messaged us via slack saying that there is some lack of data in Mainnet 24:

none of the consensus nodes have the “proposer” label anymore?

Digging into the current telemetry events, I found that starting in Mainnet 24, the TelemetryConsumer captures

Digging into the code, I found that we are not subscribing the TelemetryConsumer to the respective events (neither does the TelemetryConsumer state that it implements the consumer interface for the respective events).

This PR fixes the missing subscription problem.

Back-ported to v0.32 branch: #5021

…all happy-path interfaces are implemented

• extended subscriptions of telemetry consumer to receive missing events
@codecov-commenter
Copy link

codecov-commenter commented Nov 15, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (b05bb9c) 56.23% compared to head (68cf90e) 56.23%.
Report is 20 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #5020   +/-   ##
=======================================
  Coverage   56.23%   56.23%           
=======================================
  Files         969      969           
  Lines       90362    90417   +55     
=======================================
+ Hits        50813    50848   +35     
- Misses      35764    35777   +13     
- Partials     3785     3792    +7     
Flag Coverage Δ
unittests 56.23% <ø> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@AlexHentschel
Copy link
Member Author

AlexHentschel commented Nov 15, 2023

This is a repeat of #4518, which I obviously forgot to merge 😭

I am trying to address Yurii's suggestion from an earlier PR here here. In a nutshell, Yurii's suggestion was to introduce a TelemetryConsumer interface that bundles the respective dependencies

type TelemetryConsumer interface {
	ParticipantConsumer
	CommunicatorConsumer
	FinalizationConsumer
	VoteCollectorConsumer
	TimeoutCollectorConsumer
}

I tried this and ran into the following challenges:

  • We could combine the interfaces
     ParticipantConsumer
     CommunicatorConsumer
     FinalizationConsumer
    as these are all sub-interfaces of the hotstuff.Consumer interface where the Distributor provides the respective pub-sub implementation.
  • However, the VoteCollectorConsumer and TimeoutCollectorConsumer interfaces are separate (neither part of hotstuff.Consumer interface nor distributed by Distributor)

I feel that implementing the suggestion would entail possibly wide-spread code changes:

  1. VoteCollector and TimeoutCollector produce events that the EventLoop ingests. Therefore, I would be inclined to keep the interfaces separate and not merge them into
    type Consumer interface {
    FollowerConsumer
    CommunicatorConsumer
    ParticipantConsumer
    }
  2. Given that the TelemetryConsumer would ingest events from Consumer, as well as VoteCollectorConsumer and TimeoutCollectorConsumer, I would be inclined to to add a new Top-Level interface.

@durkmurder I have tried to address your suggestion in PR #5022. But it entails a lot of changes. Personally, I feel the code becomes a bit cleaner, but that might be very well my personal perception (implementor bias).

…ts events from `hotstuff.ParticipantConsumer`
@durkmurder
Copy link
Member

@durkmurder I have tried to address your suggestion in PR #5022. But it entails a lot of changes. Personally, I feel the code becomes a bit cleaner, but that might be very well my personal perception (implementor bias).

I think you are right it's not worth to merge {Vote, Timeout}CollectorConsumer with Consumer interfaces, current design works good if we don't forget to subscribe everything(but that's a well known problem for pub-sub systems). Everything else looks great. Thanks for the fix.

Copy link
Member

@durkmurder durkmurder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚢

@AlexHentschel AlexHentschel added this pull request to the merge queue Nov 15, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Nov 15, 2023
@AlexHentschel AlexHentschel added this pull request to the merge queue Nov 15, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Nov 15, 2023
@AlexHentschel AlexHentschel added this pull request to the merge queue Nov 15, 2023
Merged via the queue into master with commit aad2f84 Nov 15, 2023
54 checks passed
@AlexHentschel AlexHentschel deleted the alex/partially_missing_consensus_telemetry branch November 15, 2023 17:30
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants