cluster: drop redundant set_status config updates #4924

Merged: 2 commits merged into redpanda-data:dev on May 27, 2022

Contributor

@jcsp jcsp commented May 25, 2022

Cover letter

cluster: drop redundant set_status config updates

If a follower isn't seeing controller log updates
promptly, it may issue many set_status RPCs while
it's waiting. The controller leader should not
turn all of these into log writes: if the recorded
status of the node already matches what it is
reporting, the leader writes nothing.
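
The check described above can be sketched roughly as follows. This is a minimal Python illustration, not the actual Redpanda C++ code: `ConfigStatus`, `ControllerLeader`, and their fields are hypothetical names, and the toy applies a status to its in-memory map at write time, whereas the real controller would apply it when the log entry is replicated.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ConfigStatus:
    """Hypothetical per-node status report; field names are illustrative only."""
    node_id: int
    version: int
    restart: bool = False


@dataclass
class ControllerLeader:
    """Toy stand-in for the controller leader's set_status handler."""
    statuses: dict = field(default_factory=dict)  # node_id -> last recorded status
    log: list = field(default_factory=list)       # stand-in for the controller log

    def set_status(self, report: ConfigStatus) -> bool:
        """Return True if a log write was made, False if the report was redundant."""
        if self.statuses.get(report.node_id) == report:
            # The node's recorded status already matches: skip the log write.
            return False
        self.log.append(report)
        self.statuses[report.node_id] = report
        return True
```

With this shape, a follower that repeats the same report only costs the leader a map lookup, not a log write.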


cluster: wait between config set_status RPCs

On a healthy system, we do want to send set_status
RPCs as soon as we're ready. However, if the controller
log updates are not being seen promptly, this would lead
to the follower spamming the controller leader with
a large number of set_status RPCs in a tight loop.

Nodes will still send their status immediately when
a config change occurs. This change only affects the
behaviour if _another_ config change occurs while
the node is still reporting status from the first
change: in this case the follower will wait 5 seconds
before sending its next status RPC.
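
As a rough model of the pacing described above, here is a simplified Python sketch. It is not the real follower loop: `report_statuses`, `send_rpc`, and the list of pending versions are hypothetical, and the only value taken from this PR is the 5 second wait.

```python
import time
from typing import Callable, Iterable

# The 5 second wait described in the commit message.
STATUS_RETRY_WAIT_S = 5.0


def report_statuses(
    pending_versions: Iterable[int],
    send_rpc: Callable[[int], None],
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Send one status RPC per pending config version.

    The first RPC goes out immediately; every later one in the same
    reporting run is preceded by a fixed wait, so the loop cannot spin.
    Returns the number of RPCs sent.
    """
    sent = 0
    for version in pending_versions:
        if sent > 0:
            # Another change arrived while we were still reporting: back off.
            sleep(STATUS_RETRY_WAIT_S)
        send_rpc(version)
        sent += 1
    return sent
```

Passing `sleep` as a parameter is only a convenience for exercising the sketch without real delays.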

Fixes #4923

Release notes

Improvements

  • Reduced frequency of configuration status RPCs when the cluster is in a degraded state

jcsp added 2 commits May 25, 2022 12:10
cluster: drop redundant set_status config updates (Fixes redpanda-data#4923)
cluster: wait between config set_status RPCs (Related redpanda-data#4923)
@jcsp jcsp added the kind/bug and area/controller labels May 25, 2022
Contributor Author

jcsp commented May 26, 2022

TopicRecoveryTest.test_fast2 is likely a variant of #4886, but certainly unrelated to this change.

@jcsp jcsp marked this pull request as ready for review May 26, 2022 09:02
Member

@mmaslankaprv mmaslankaprv left a comment

lgtm, this will need a backport to v22.1.x

@jcsp jcsp merged commit 3810e04 into redpanda-data:dev May 27, 2022
@jcsp jcsp deleted the issue-4923-config-status-spam branch May 27, 2022 17:48
Contributor Author

jcsp commented May 27, 2022

/backport v22.1.x

Successfully merging this pull request may close these issues.

cluster: many redundant config status log messages may be written if follower is not seeing updates promptly