Heal network failures in the end of EndToEndFinjectorTest #6342

Merged
merged 2 commits into from
Sep 9, 2022

ztlpn
Contributor

@ztlpn ztlpn commented Sep 8, 2022

Cover letter

Previously, we did not heal network failures when exiting tests that use the EndToEndFinjectorTest base class (namely, AvailabilityTests). Because the possible failure types include netem failures, all subsequent tests in the same test run used a node with a crippled network, which led to flakiness.

This PR adds a teardown method that heals all introduced network failures, including netem ones.
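
To illustrate the idea, here is a minimal, self-contained sketch of a base class whose teardown heals every injected failure. It is not the code from this PR: `HypotheticalFailureInjector`, `heal_all`, `delete_netem`, and the `unittest`-based classes are stand-ins invented for the example, and the real tests run under a different test framework.

```python
import unittest


class HypotheticalFailureInjector:
    """Stand-in for a fault injector (hypothetical, not the real API)."""

    def __init__(self):
        self.active = set()  # (node, failure_type) pairs currently in effect

    def inject(self, node, failure_type):
        self.active.add((node, failure_type))

    def heal_all(self):
        # Heals everything except netem rules, which need separate cleanup.
        self.active = {(n, f) for (n, f) in self.active if f == "netem"}

    def delete_netem(self):
        # Removes netem rules from every node.
        self.active = {(n, f) for (n, f) in self.active if f != "netem"}


# Shared across tests, like nodes that are reused within one test run.
SHARED_INJECTOR = HypotheticalFailureInjector()


class FinjectorBaseSketch(unittest.TestCase):
    def setUp(self):
        self.finjector = SHARED_INJECTOR

    def tearDown(self):
        # Runs after each test, pass or fail, so no failure leaks
        # into whichever test uses the node next.
        self.finjector.heal_all()
        self.finjector.delete_netem()


class AvailabilitySketch(FinjectorBaseSketch):
    def test_with_failures(self):
        self.finjector.inject("node-1", "isolate")
        self.finjector.inject("node-2", "netem")
        self.assertEqual(len(self.finjector.active), 2)


def run_sketch():
    """Run the sketch test and return (success, failures left behind)."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AvailabilitySketch)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful(), set(SHARED_INJECTOR.active)
```

In this sketch, dropping the `delete_netem()` call from `tearDown` would leave the netem entry on `node-2` after the test, which mirrors the leak described above.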

Fixes #5980

Backport Required

  • v22.2.x
  • v22.1.x

UX changes

none

Release notes

  • none

@dotnwat
Member

dotnwat commented Sep 9, 2022

@ztlpn this PR marks #5980 as fixed. but looking at that issue it seems to be related to PartitionBalancerTest.test_full_nodes and I'm not sure how that test would be related to AvailabilityTests and the EndToEndFinjectorTest...?

EDIT: nevermind, i missed that you wrote

> that meant that all subsequent tests in a test run used a node with crippled network

nice find!

@dotnwat
Member

dotnwat commented Sep 9, 2022

Do we need to remove an ok_to_fail now?

@ztlpn
Contributor Author

ztlpn commented Sep 9, 2022

> Do we need to remove an ok_to_fail now?

@dotnwat not yet - there is also #5884

@ztlpn ztlpn merged commit acf64f1 into redpanda-data:dev Sep 9, 2022
@ztlpn
Contributor Author

ztlpn commented Sep 9, 2022

/backport v22.2.x

@ztlpn
Contributor Author

ztlpn commented Sep 9, 2022

/backport v22.1.x

This pull request was closed.
Development

Successfully merging this pull request may close these issues.

Timeout waiting for partition balancer "ready" status in PartitionBalancerTest.test_full_nodes