cluster: Stale leadership in partition_leaders_table on leader (failure in ClusterConfigTest.test_restart) #3486

Closed
jcsp opened this issue Jan 14, 2022 · 2 comments · Fixed by #3487

jcsp commented Jan 14, 2022

This manifests as a failure of ClusterConfigTest.test_restart, because that test checks for convergence of config versions, and config versions are only updated if nodes can see a controller leader. It may be destabilizing other tests too, if they have timeouts that rely on a controller leader being available within a certain time.

https://buildkite.com/vectorized/redpanda/builds/6142#3eb5ad1e-c519-4c16-b39c-5355ff4cf590

After an election has succeeded, the metadata dissemination service's ticker is still using the content of node health reports to set leadership. If the last node in the list of node health reports says leader=null, this repeatedly overrides the local partition leader table until the next round of health reports comes in.
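To make the failure mode concrete, here is a minimal sketch of it, not the actual Redpanda code. The types `leader_claim` and `leader_entry` and the function `apply_all_claims` are hypothetical stand-ins, and the model assumes each health report claim is applied in list order with only an "older term" check.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical, simplified types for illustration only.
struct leader_claim {
    long term;
    std::optional<int> leader; // std::nullopt models leader=null
};

struct leader_entry {
    long term;
    std::optional<int> leader;
};

// Assumed model of the problematic pass: each claim whose term is not older
// than the stored term overwrites the entry, null-leader claims included.
inline void apply_all_claims(leader_entry& entry,
                             const std::vector<leader_claim>& claims) {
    for (const auto& claim : claims) {
        if (claim.term < entry.term) {
            continue; // out-of-date term, skipped
        }
        entry.term = claim.term;
        entry.leader = claim.leader;
    }
}

// Usage sketch: node 1 won the election in term 5, but the last report comes
// from a restarted node that knows the term and has not heard from the leader.
inline leader_entry stale_leadership_example() {
    leader_entry entry{5, 1};
    const std::vector<leader_claim> claims{{5, 1}, {5, 1}, {5, std::nullopt}};
    apply_all_claims(entry, claims);
    assert(!entry.leader.has_value()); // the known leader was reset to null
    return entry;
}
```

In this model the outcome depends only on which report happens to be last in the list, which matches the symptom described above: the entry stays null on every tick until a new round of reports arrives.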

This behavior was introduced in #3355


commit c8f4f12ae88dafab7f26b1e99e4711b3fa39642f
Author: Michal Maslanka <michal@vectorized.io>
Date:   Wed Jan 5 13:18:31 2022 +0100

    c/dissemination: use health manager information to update leaders
jcsp added the kind/bug and area/controller labels Jan 14, 2022
jcsp added a commit to jcsp/redpanda that referenced this issue Jan 14, 2022
This regressed in c8f4f12

Node health reports may disagree with one another about
leadership in a particular term, if some of them claim
that it's null (because they've seen the term in their
own log after restart, but not yet received an append_entries
from the leader).

To avoid a rogue node health report resetting the leadership
of a topic to null, ignore health report leadership
information if it claims a null leader.

Non-null claims are always believable, because of the term:
if they're out of date, then they were still correct for
the term they claim, and we ignore those out of date
terms in partition_leaders_table::update_partition_leader.

Fixes redpanda-data#3486
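
The rule described in this commit message can be sketched as follows. This is an illustration under stated assumptions, not the code from the fix: `leader_claim`, `leader_entry` and `apply_health_report_claims` are hypothetical names, and the free function `update_partition_leader` is only a stand-in for the term check that the message attributes to `partition_leaders_table::update_partition_leader`.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical, simplified types for illustration only.
struct leader_claim {
    long term;
    std::optional<int> leader; // std::nullopt models leader=null
};

struct leader_entry {
    long term;
    std::optional<int> leader;
};

// Stand-in for the term check: claims for an older term are dropped.
inline void update_partition_leader(leader_entry& entry, long term,
                                    std::optional<int> leader) {
    if (term < entry.term) {
        return;
    }
    entry.term = term;
    entry.leader = leader;
}

// Apply health report claims, ignoring any claim of a null leader.
inline void apply_health_report_claims(leader_entry& entry,
                                       const std::vector<leader_claim>& claims) {
    for (const auto& claim : claims) {
        if (!claim.leader.has_value()) {
            continue; // a null claim must not reset a known leader
        }
        update_partition_leader(entry, claim.term, claim.leader);
    }
}

// Usage sketch: a stale non-null claim (term 4) and a trailing null claim
// (term 5) both leave the term-5 leader untouched.
inline leader_entry null_claims_ignored_example() {
    leader_entry entry{5, 1};
    const std::vector<leader_claim> claims{{5, 1}, {4, 2}, {5, std::nullopt}};
    apply_health_report_claims(entry, claims);
    assert(entry.leader.has_value() && *entry.leader == 1);
    return entry;
}
```

The two filters are complementary in this sketch: the null check handles nodes that have seen the term but not the leader, and the term check handles non-null claims that are merely out of date.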
jcsp self-assigned this Jan 14, 2022
@gousteris


jcsp commented Jan 17, 2022

@gousteris no, that failure is from before this merged.

mmaslankaprv pushed a commit to mmaslankaprv/redpanda that referenced this issue Jan 27, 2022
(cherry picked from commit 1335dff)
This issue was closed.