Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consensus Compliance Engine updated to conform to Component interface #3121

Merged

Conversation

jordanschalm
Copy link
Member

@jordanschalm jordanschalm commented Aug 29, 2022

This PR updates consensus/compliance engine to implement component.Component.

The epochmgr and collection/compliance engine will be updated in a subsequent PR.

Other Changes

  • Replaces type-unsafe communication between compliance, sync, and provider engines with specific interfaces
  • Changes provider engine from a deprecated Engine to a MessageProcessor
  • Refactors Stopper slightly to work with context cancellation

engine/consensus/compliance/engine.go Outdated Show resolved Hide resolved
engine/consensus/provider/engine.go Outdated Show resolved Hide resolved
@jordanschalm jordanschalm changed the title Compliance Engines updated to conform to Component interface Consensus Compliance Engine updated to conform to Component interface Aug 30, 2022
@codecov-commenter
Copy link

codecov-commenter commented Sep 1, 2022

Codecov Report

Merging #3121 (97d3714) into feature/active-pacemaker (1da5cdc) will decrease coverage by 1.62%.
The diff coverage is 61.07%.

@@                     Coverage Diff                      @@
##           feature/active-pacemaker    #3121      +/-   ##
============================================================
- Coverage                     54.84%   53.21%   -1.63%     
============================================================
  Files                           487      694     +207     
  Lines                         39550    65832   +26282     
============================================================
+ Hits                          21692    35035   +13343     
- Misses                        16068    27870   +11802     
- Partials                       1790     2927    +1137     
Flag Coverage Δ
unittests 53.21% <61.07%> (-1.63%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
engine/enqueue.go 71.42% <ø> (ø)
utils/unittest/irrecoverable.go 0.00% <0.00%> (ø)
utils/unittest/unittest.go 7.60% <0.00%> (ø)
engine/consensus/compliance/core.go 74.60% <41.66%> (ø)
engine/consensus/provider/engine.go 34.32% <42.10%> (ø)
engine/common/synchronization/engine.go 72.76% <62.50%> (ø)
engine/collection/compliance/engine.go 63.56% <66.66%> (ø)
engine/consensus/compliance/engine.go 61.41% <77.77%> (ø)
engine/collection/compliance/core.go 80.73% <100.00%> (ø)
engine/common/follower/engine.go 52.47% <100.00%> (ø)
... and 329 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

consensus/integration/integration_test.go Outdated Show resolved Hide resolved
consensus/integration/nodes_test.go Outdated Show resolved Hide resolved
consensus/integration/stopper_test.go Outdated Show resolved Hide resolved
consensus/integration/stopper_test.go Outdated Show resolved Hide resolved
consensus/integration/stopper_test.go Show resolved Hide resolved
engine/consensus/compliance/engine.go Outdated Show resolved Hide resolved
err = cs.engine.BroadcastProposalWithDelay(header, 0)
require.Error(cs.T(), err, "should fail with missing payload")
header.View--
<-util.AllClosed(broadcasted, submitted)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
<-util.AllClosed(broadcasted, submitted)
unittest.AssertClosesBefore(cs.T(), util.AllClosed(broadcasted, submitted), time.Second)

require.Error(cs.T(), err, "should fail with missing payload")
header.View--
<-util.AllClosed(broadcasted, submitted)
cs.cancel()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question here, I don't think we need to stop manually

return cs.engine.pendingVotes.(*engine.FifoMessageStore).Len() == 0
}, time.Second, time.Millisecond*10)
// stop the engine
cs.cancel()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to stop manually?


allViews := allFinalizedViews(t, nodes)
for i, views := range allViews {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this for debugging ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes - will remove this...

* catch irrecoverable errors in separate goroutine
* don't cancel engine context with individual cases
@@ -149,6 +149,13 @@ func AssertReturnsBefore(t *testing.T, f func(), duration time.Duration, msgAndA
}
}

// ClosedChannel returns a closed channel.
func ClosedChannel() <-chan struct{} {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice.

Comment on lines 17 to 18
// MessageStore is the interface to abstract how messages are buffered in memory before
// being handled by the engine
// being handled by the engine.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// MessageStore is the interface to abstract how messages are buffered in memory before
// being handled by the engine
// being handled by the engine.
// MessageStore is the interface to abstract how messages are buffered in memory
// while waiting to be processed.

engine/enqueue.go Outdated Show resolved Hide resolved
Comment on lines 49 to 50
mempool module.MempoolMetrics
metrics module.EngineMetrics
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally find the variable naming here a bit ambiguous, because mempool is not an actual mempool but the metrics collector for mempools. I think it would improve clarity, if this was reflected in the variable name.

Suggested change
mempool module.MempoolMetrics
metrics module.EngineMetrics
mempoolMetrics module.MempoolMetrics
componentMetrics module.EngineMetrics

would suggest to also rename the corresponding variables in compliance.Core respectively

engine/consensus/compliance/engine.go Show resolved Hide resolved
@@ -47,11 +51,17 @@ func (s *Stopper) AddNode(n *Node) *StopperConsumer {
s.Lock()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we add a brief goDoc to the AddNode method please? In particular, a comment regarding the intended usage of the return value would be great. To be honest, I don't understand why we return this consumer here and I also don't understand why it is called a StopperConsumer. Is this legacy code? I feel we could just remove it ... but not sure.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point - as far as I can tell the StopperConsumer does nothing at all. Going to remove it.

stopConsumer := &StopperConsumer{}
return stopConsumer
}

// WithStopFunc adds a function to use to stop all nodes (typically the cancel function of the context used to start them).
func (s *Stopper) WithStopFunc(stop func()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is not concurrency safe, which should be highlighted in the goDoc.

@@ -454,15 +434,15 @@ func (e *Engine) BroadcastTimeout(timeout *model.TimeoutObject) error {
return
}
if err != nil {
log.Error().Err(err).Msg("could not broadcast timeout")
log.Err(err).Msg("could not broadcast timeout")
return
}
log.Info().Msg("consensus timeout broadcast")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not quite sure, but my understanding is that the commonly used past tense of "broadcast" is also "broadcast". I think it would avoid ambiguity, if we clarify that the broadcast has completed:

Suggested change
log.Info().Msg("consensus timeout broadcast")
log.Info().Msg("consensus timeout was broadcast")

similarly here, I would suggest to use a consistent working

// submit to broadcast proposal
err := cs.engine.BroadcastProposalWithDelay(block.Header, 0)
require.NoError(cs.T(), err, "header broadcast should pass")
// unset chain and height to make sure they are correctly reconstructed
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we put lines 141 - 172 into their own cs.Run method to increase the structure of the test?

// TestOnFinalizedBlock tests if finalized block gets processed when send through `Engine`.
// Tests the whole processing pipeline.
func (cs *ComplianceSuite) TestOnFinalizedBlock() {
func (cs *EngineSuite) TestOnFinalizedBlock() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For my learning: is the old implementation (below) actually buggy or just unnecessarily verbose?

require.Eventually(cs.T(),
func() bool {
return cs.pending.AssertCalled(cs.T(), "PruneByView", finalizedBlock.View)
}, time.Second, time.Millisecond*20)

Comment on lines +110 to +111
if e.me.NodeID() != proposal.Header.ProposerID {
return fmt.Errorf("sanity check failed: attempted to provide another node's proposal (%x!=%x)", e.me.NodeID(), proposal.Header.ProposerID)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this sanity check is also in the compliance Engine:

// first, check that we are the proposer of the block
if header.ProposerID != e.me.NodeID() {
return fmt.Errorf("cannot broadcast proposal with non-local proposer (%x)", header.ProposerID)
}

do we need it in both places?
... the check is probably cheap enough that it makes no practical difference, so why not 🤷

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I'm more inclined to just leave it

Copy link
Member

@AlexHentschel AlexHentschel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the nice refactoring, the great documentation and the test revisions.

In response to Yurii's comment, I have created PR #3243 (targeting this branch) that adds the missing error documentation.

Copy link
Member

@durkmurder durkmurder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, great effort.

Alexander Hentschel and others added 10 commits September 16, 2022 15:24
* • documented expected error returns in compliance.Core and compliance.Engine
• compactified code by removing unnecessary line breaks

* Update engine/consensus/compliance/core.go

Co-authored-by: Jordan Schalm <jordan@dapperlabs.com>
Co-authored-by: Alexander Hentschel <alex.hentschel@axiomzen.co>
@jordanschalm
Copy link
Member Author

bors merge

@bors bors bot merged commit b4c2fcc into feature/active-pacemaker Sep 20, 2022
@bors bors bot deleted the jordan/6263-compliance-component-pattern branch September 20, 2022 14:41
bors bot added a commit that referenced this pull request Oct 4, 2022
3248: Collection`compliance`, `epochmgr` Engines Implement `component.Component` r=jordanschalm a=jordanschalm

⚠️ Depends on #3121

This PR updates `collection/compliance.Engine` and `collection/epochmgr.Engine` to implement `component.Component` (#3121, but for collection nodes).

Co-authored-by: Jordan Schalm <jordan@dapperlabs.com>
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants