[v22.2.x] More robust waiting for the quiescent state in partition balancer tests #6032

Merged

Commits on Aug 15, 2022

  1. tests/partition_balancer: more robust wait for quiescent state

    Because the unavailability timer resets every time the controller leader
    changes, robustly waiting for the timer to elapse is hard. Instead, we
    simply wait until the unavailable node appears in the "violations"
    status field.
    
    (cherry picked from commit ce4a809)
    ztlpn authored and vbotbuildovich committed Aug 15, 2022
    15fec87
  2. tests/partition_balancer: inline helper constant

    (cherry picked from commit 514f818)
    ztlpn authored and vbotbuildovich committed Aug 15, 2022
    3127f59
  3. tests/partition_balancer: remove unused variable

    (cherry picked from commit 0946f39)
    ztlpn authored and vbotbuildovich committed Aug 15, 2022
    c36c06f
  4. tests/partition_balancer: inline rarely used helper functions

    (cherry picked from commit 220c958)
    ztlpn authored and vbotbuildovich committed Aug 15, 2022
    542262e
  5. admin_server: more logging in get_partition_balancer_status

    (cherry picked from commit a8f56b5)
    ztlpn authored and vbotbuildovich committed Aug 15, 2022
    68b54be
  6. tests/partition_balancer: more robust wait_until_status

    Previously, when the controller leader node was suspended during the
    test, all status requests would fail with a timeout error.
    This was true for all nodes, not just the suspended one (because we
    proxy the status request to the controller leader), so internal retries
    in the admin API wrapper didn't help. We increase the timeout and add
    504 to the retriable status codes so that internal retries can handle this
    situation.
    
    (cherry picked from commit dc83a7b)
    ztlpn authored and vbotbuildovich committed Aug 15, 2022
    7ae38b0
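
The two behavioral changes described in commits 1 and 6 (polling until the unavailable node shows up under "violations", and treating 504 as a retriable status) can be illustrated with a minimal sketch. This is not the code from the PR: `HttpError`, `with_retries`, `wait_until_node_in_violations`, the `unavailable_nodes` key, and the exact set of retriable codes are all hypothetical names and values chosen for illustration, and the status fetcher is passed in as a plain callable rather than a real admin API client.

```python
import time
from typing import Any, Callable, Dict, Set

# Assumption: an illustrative set of retriable codes. The commit message only
# says that 504 was added to the existing set; the other members are guesses.
RETRIABLE_STATUS: Set[int] = {502, 503, 504}


class HttpError(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, status: int):
        super().__init__(f"HTTP {status}")
        self.status = status


def with_retries(call: Callable[[], Any], attempts: int = 5,
                 backoff_s: float = 0.5,
                 retriable: Set[int] = RETRIABLE_STATUS) -> Any:
    """Call `call`, retrying when it raises HttpError with a retriable code."""
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except HttpError as e:
            # Give up on non-retriable codes or once attempts are exhausted.
            if e.status not in retriable or attempt == attempts:
                raise
            time.sleep(backoff_s)


def wait_until_node_in_violations(get_status: Callable[[], Dict[str, Any]],
                                  node_id: int, timeout_s: float = 120.0,
                                  poll_s: float = 1.0,
                                  backoff_s: float = 0.5) -> Dict[str, Any]:
    """Poll the status until `node_id` is reported as unavailable.

    Assumption: the status is a dict shaped like
    {"violations": {"unavailable_nodes": [...]}}; the inner key is hypothetical.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = with_retries(get_status, backoff_s=backoff_s)
        unavailable = status.get("violations", {}).get("unavailable_nodes", [])
        if node_id in unavailable:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"node {node_id} never appeared in violations")
        time.sleep(poll_s)
```

Waiting on an observable field rather than on elapsed time means the check does not depend on when the unavailability timer was last reset. Retrying on 504 covers the case where a healthy node proxies the request to a suspended controller leader and the proxied call times out.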