Add 'maintenance mode' for redpanda nodes #2093

Closed
jcsp opened this issue Aug 18, 2021 · 3 comments
Labels
area/redpanda, kind/enhance (New feature or request)

Comments

@jcsp
Contributor

jcsp commented Aug 18, 2021

Now that we're adding more complicated cluster management features like leadership rebalancing, unresponsive nodes in the system become increasingly troublesome:

  • they disrupt or break collective algorithms, e.g. leadership rebalancing sees down nodes as nodes with no leaders, and therefore as great places to migrate leadership to.
  • they generate log noise from connection errors.
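
To illustrate the first point, here is a rough sketch in Python (not Redpanda code; `pick_leadership_target` and the `maintenance` set are hypothetical names). A balancer that simply picks the node with the fewest leaders gravitates towards a down node, whereas one that is told which nodes are in maintenance can skip them.

```python
def pick_leadership_target(leader_counts, maintenance=frozenset()):
    """Return the node with the fewest leaders, skipping nodes in maintenance."""
    candidates = {n: c for n, c in leader_counts.items() if n not in maintenance}
    if not candidates:
        return None
    # Tie-break on node id so the choice is deterministic.
    return min(candidates, key=lambda n: (candidates[n], n))


# Node 3 is down, so it holds no leaders and looks like the emptiest node.
leader_counts = {1: 12, 2: 11, 3: 0}

naive_target = pick_leadership_target(leader_counts)
aware_target = pick_leadership_target(leader_counts, maintenance={3})
```

Here the naive call selects the down node, while the maintenance-aware call falls back to the least-loaded node that is still in service.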

When the administrator is intentionally stopping nodes, we should let them inform the cluster and thereby avoid these issues.

When entering maintenance mode, nodes should give up leaderships (an abdicate admin API is added in #1936).
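
As a sketch of what this could look like, assuming a toy in-memory model (the `Partition` and `Cluster` classes and `enter_maintenance` are hypothetical and do not reflect the API in #1936): marking a node as in maintenance hands each of its leaderships to a replica that is still in normal service.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    name: str
    replicas: list  # node ids holding a replica
    leader: int     # node id of the current leader


@dataclass
class Cluster:
    partitions: list
    maintenance: set = field(default_factory=set)

    def enter_maintenance(self, node_id):
        """Mark node_id as in maintenance and move its leaderships elsewhere."""
        self.maintenance.add(node_id)
        for p in self.partitions:
            if p.leader != node_id:
                continue
            # Only hand leadership to replicas that are in normal service.
            eligible = [r for r in p.replicas if r not in self.maintenance]
            if eligible:
                p.leader = eligible[0]
            # With no eligible replica, leadership stays put in this sketch.


cluster = Cluster([
    Partition("topic/0", [1, 2, 3], leader=1),
    Partition("topic/1", [1, 2, 3], leader=2),
])
cluster.enter_maintenance(1)
leaders = [p.leader for p in cluster.partitions]
```

A real implementation would presumably go through raft leadership transfer rather than assigning a leader directly; the sketch only shows the intended end state.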

In Raft, we should still send heartbeats to nodes in maintenance mode so that they can catch up before being brought back into normal service. However, it would be nice to avoid logging connection errors for nodes in maintenance mode; this might be something to implement in the RPC layer.
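
One possible shape for this, sketched in Python with the standard `logging` module (the `heartbeat_round` function, the `send` callback and the `maintenance` set are hypothetical): every peer is still heartbeated, but a connection failure to a node in maintenance is logged at DEBUG instead of ERROR.

```python
import logging

log = logging.getLogger("rpc_sketch")


def heartbeat_round(peers, maintenance, send):
    """Heartbeat every peer; return the log level used for each failure."""
    failure_levels = {}
    for peer in peers:
        try:
            send(peer)  # nodes in maintenance are still heartbeated
        except ConnectionError as e:
            # Expected failures (maintenance) are demoted to DEBUG.
            level = logging.DEBUG if peer in maintenance else logging.ERROR
            log.log(level, "heartbeat to node %s failed: %s", peer, e)
            failure_levels[peer] = level
    return failure_levels


def fake_send(peer):
    # Stand-in transport: nodes 2 and 3 are unreachable.
    if peer in (2, 3):
        raise ConnectionError("connection refused")


levels = heartbeat_round([1, 2, 3], maintenance={3}, send=fake_send)
```

Demoting the level rather than dropping the message keeps the failures visible to anyone who turns on debug logging while a node is being serviced.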

jcsp added the kind/enhance (New feature or request) and area/redpanda labels Aug 18, 2021
@jcsp
Contributor Author

jcsp commented Aug 19, 2021

Related: #2092

@jcsp
Contributor Author

jcsp commented Nov 19, 2021

Duplicated by #3020

jcsp closed this as completed Nov 19, 2021
@dotnwat
Member

dotnwat commented Nov 19, 2021

Duplicated by #3020

Oh, thanks, I didn't think to search first :/ I think there are some bits in here worth moving over to #3020 too. I can do that.
