fix raft replace_whole_group test #342

Closed
mmaslankaprv opened this issue Dec 23, 2020 · 1 comment · Fixed by #364 or #5696
Labels
area/raft ci-failure kind/bug Something isn't working

Comments

@mmaslankaprv

The replace_whole_group test is failing in CI. We have to investigate to find out the reason and fix it.

@mmaslankaprv mmaslankaprv added area/raft kind/bug Something isn't working labels Dec 23, 2020
@mmaslankaprv mmaslankaprv self-assigned this Dec 23, 2020
@mmaslankaprv mmaslankaprv mentioned this issue Jan 4, 2021
2 tasks
@dotnwat

dotnwat commented Jul 22, 2022

https://buildkite.com/redpanda/redpanda/builds/12913#01822533-b065-455a-88b4-80ede686f06a/6-5021

../../../src/v/raft/tests/raft_group_fixture.h(575): fatal error: in "replace_whole_group": Timeout elapsed while wating for: new nodes are up to date

@dotnwat dotnwat reopened this Jul 22, 2022
mmaslankaprv added a commit to mmaslankaprv/redpanda that referenced this issue Jul 28, 2022
Fixed the `replace_whole_group` test by waiting for the whole raft group
to be up to date before executing the configuration change.

The test was flaky because the configuration was sometimes changed before
any records had been propagated to one of the nodes (raft correctness is
based on a majority, so replication can succeed while one node still
lags). This caused the condition that validates whether some batches can
be read to fail permanently.

Fixes: redpanda-data#342

Signed-off-by: Michal Maslanka <michal@redpanda.com>
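
To illustrate the race described in the commit message, here is a minimal, self-contained simulation. It is only a sketch and not Redpanda code: `Node`, `replicate`, `catch_up`, `all_up_to_date`, and `wait_until` are hypothetical names invented for this example, and the real test fixture is C++. It shows that a write can reach a majority while one follower still has an empty log, and that a bounded wait for every node to catch up closes that gap before a configuration change would run.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a raft group member holding a log."""
    node_id: int
    log: list = field(default_factory=list)


def replicate(leader, followers, batch, reachable):
    """Append a batch on the leader and on reachable followers.

    Returns True when a majority of the group holds the batch, which is
    all that is needed for the write to count as committed.
    """
    leader.log.append(batch)
    acks = 1  # the leader counts itself
    for follower in followers:
        if follower.node_id in reachable:
            follower.log.append(batch)
            acks += 1
    group_size = 1 + len(followers)
    return acks > group_size // 2


def catch_up(leader, follower):
    """Copy any entries the follower is missing from the leader's log."""
    follower.log.extend(leader.log[len(follower.log):])


def all_up_to_date(leader, followers):
    """True only when every follower holds as many entries as the leader."""
    return all(len(f.log) == len(leader.log) for f in followers)


def wait_until(predicate, timeout_s, poll_s=0.01):
    """Poll a predicate until it holds or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(poll_s)
    return predicate()


if __name__ == "__main__":
    leader, f2, f3 = Node(1), Node(2), Node(3)
    # Node 3 is unreachable, yet 2 of 3 nodes is still a majority.
    committed = replicate(leader, [f2, f3], "batch-0", reachable={2})
    print("committed:", committed, "node 3 log:", f3.log)
    # Changing the configuration now would leave node 3 with nothing to read.
    catch_up(leader, f3)
    ready = wait_until(lambda: all_up_to_date(leader, [f2, f3]), timeout_s=0.1)
    print("safe to change configuration:", ready)
```

In this sketch the configuration change is gated on `all_up_to_date` rather than on the majority result, which mirrors the ordering the fix describes.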
andrwng pushed a commit to andrwng/redpanda that referenced this issue Aug 5, 2022