Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Raft config update fix #364

Merged
merged 3 commits into from
Jan 11, 2021

Conversation

mmaslankaprv
Copy link
Member

Fixed replcate_whole_group test. The test failed because of an error in leader election process.

Fixes: #342

Checklist

When referencing a related issue, remember to migrate duplicate stories from the
external tracker. This is not relevant for most users.

Currently if node is no longer in configuration it may be promoted to
voter as `raft::configuration::is_voter()` return `false` when node is
not present in configuration - this is correct as node is not a voter.
Added explicit check to make sure that node is part of current
configuration before promoting it from learner to voter.

Signed-off-by: Michal Maslanka <michal@vectorized.io>
Added updating `_target_priority` before skipping dispatching votes when
node is not a voter.

Learners are allowed to vote (not dispatch vote requests),
since learner may not be updated with information that it was promoted
while other nodes are. Learners ability to cast votes requires learners
to update the `_target_priority` value every time leader election wasn't
successful. The update was done when dispatching votes, but for learners
dispatching vote was skipped, hence the `_target_priority` wasn't updated
on learners. This might lead to situation where raft protocol couldn't
make progress since leader election would never succeed. Fixed the issue
by checking if node is a voter after updating `_target_priority`/

Signed-off-by: Michal Maslanka <michal@vectorized.io>
After fixing vote issues during configuration update, enabled back the
`replace_whole_group` test.

Signed-off-by: Michal Maslanka <michal@vectorized.io>
// is voter already
if (config().is_voter(id)) {
if (latest_cfg.is_voter(id)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if it is possible that maybe_promote_to_voter is called for a node id that is already a voter, is it possible that it is called for a node id that is already a voter and is also not in the current configuration? would that need a special case like demotion to candidate?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by demotion to candidate ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking back at this i don't think i have the same question now. i must have be confused about the combination of conditions.

@@ -500,8 +500,6 @@ bool consensus::should_skip_vote(bool ignore_heartbeat) {
}

skip_vote |= _vstate == vote_state::leader; // already a leader
skip_vote |= !_configuration_manager.get_latest().is_voter(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this change mean that vote_stm::vote would now allow a non-voter cast a vote?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

non-voters were always allow to cast a vote but not to request a vote. This behavior didn't change

@dotnwat dotnwat requested a review from rystsov January 5, 2021 02:12
@mmaslankaprv mmaslankaprv merged commit 441c8e7 into redpanda-data:dev Jan 11, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

fix raft replace_whole_group test
2 participants