Operator: add support for downscaling #5019

Merged
merged 18 commits into from
Jun 21, 2022

Commits on Jun 16, 2022

  1. operator: split readyReplicas from replicas and add currentReplicas

    Now status fields have the following behaviour:
    - replicas: reflects StatefulSet status replicas (no longer readyReplicas)
    - readyReplicas: reflects StatefulSet status readyReplicas
    - currentReplicas: managed by the operator to dynamically change the current number of replicas, to gradually match user expectations
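
The split between the three counters can be sketched in Go as follows. This is only an illustration, not the operator's actual API type: `ClusterStatus` and `syncFromStatefulSet` are hypothetical names, and the JSON tags are assumed from the field names in the commit message.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ClusterStatus is a hypothetical, trimmed-down status type holding only the
// three counters described in the commit message.
type ClusterStatus struct {
	// Replicas mirrors the StatefulSet status replicas.
	Replicas int32 `json:"replicas"`
	// ReadyReplicas mirrors the StatefulSet status readyReplicas.
	ReadyReplicas int32 `json:"readyReplicas"`
	// CurrentReplicas is owned by the operator and moves gradually
	// towards the replica count requested by the user.
	CurrentReplicas int32 `json:"currentReplicas"`
}

// syncFromStatefulSet copies the two StatefulSet-derived counters and leaves
// CurrentReplicas untouched, since only the operator logic may change it.
func syncFromStatefulSet(st ClusterStatus, stsReplicas, stsReady int32) ClusterStatus {
	st.Replicas = stsReplicas
	st.ReadyReplicas = stsReady
	return st
}

func main() {
	st := syncFromStatefulSet(ClusterStatus{CurrentReplicas: 3}, 3, 2)
	out, err := json.Marshal(st)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```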
    nicolaferraro committed Jun 16, 2022
    2711976
  2. operator: add decommissioningNode status field

    When decommissioning a node, the field is populated with the ordinal number of the node being decommissioned. In case of recommission, it also indicates the node being currently recommissioned.
    nicolaferraro committed Jun 16, 2022
    8b3559d
  3. operator: change webhook to allow decommissioning

    The replicas field can freely change and the controller will make sure that nodes are properly decommissioned. The only remaining restriction is that replicas cannot be 0 or nil.
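
A minimal sketch of the remaining restriction, assuming the replica count is carried as a `*int32`. The function name `validateReplicas` is hypothetical, and rejecting negative values is an extra assumption beyond the "0 or nil" rule stated in the commit message.

```go
package main

import (
	"errors"
	"fmt"
)

// validateReplicas is a hypothetical stand-in for the webhook check: any
// change to the replica count is accepted as long as the value is set and
// greater than zero. Negative values are rejected too (an assumption).
func validateReplicas(replicas *int32) error {
	if replicas == nil {
		return errors.New("replicas must be specified")
	}
	if *replicas <= 0 {
		return fmt.Errorf("replicas must be greater than 0, got %d", *replicas)
	}
	return nil
}

func main() {
	two := int32(2)
	zero := int32(0)
	fmt.Println(validateReplicas(&two) == nil)  // downscaling to 2 is allowed
	fmt.Println(validateReplicas(&zero) == nil) // 0 is rejected
	fmt.Println(validateReplicas(nil) == nil)   // nil is rejected
}
```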
    nicolaferraro committed Jun 16, 2022
    0131182
  4. operator: move types to their own package to avoid dependency loop

    This allows the pkg/resources package to use the admin API internal interface.
    nicolaferraro committed Jun 16, 2022
    d6dac46
  5. e9ecea7
  6. operator: allow scoping internal admin API to specific nodes

    This allows getting local information from individual brokers, such as their local configuration.
    nicolaferraro committed Jun 16, 2022
    f8a7bcd
  7. operator: remove stack trace from logs when delay is requested

    Requesting a delay while waiting for a condition used to produce a stack trace in the logs.
    nicolaferraro committed Jun 16, 2022
    1913822
  8. 33c1034
  9. operator: add scale handler to properly decommission and recommission nodes
    
    This adds a handler that correctly manages upscaling and downscaling the cluster, decommissioning nodes when needed.
    
    The handler uses `status.currentReplicas` to signal the number of replicas that all subcontrollers should materialize.
    When a cluster is downscaled, the handler first tries to decommission the last node via admin API, then decreases the value of `status.currentReplicas`, to remove the node only when the cluster allows it.
    
    In case the cluster refuses to decommission a node (e.g. min replicas on a topic higher than the desired number of nodes), the user can increase `spec.replicas` to trigger a recommission of the node.
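
The flow described above can be sketched as a pure decision function. This is a simplified illustration under stated assumptions, not the handler's real code: the `Action` values, `nextAction`, and the `nodeGone` input (standing in for an admin API check) are all hypothetical, and the recommission condition (`spec.replicas` raised above the ordinal being decommissioned) is an assumed reading of the commit message.

```go
package main

import "fmt"

// Action is a hypothetical label for one reconciliation step.
type Action string

const (
	ActionNone              Action = "none"
	ActionScaleUp           Action = "scale-up"
	ActionStartDecommission Action = "start-decommission"
	ActionWaitDecommission  Action = "wait-decommission"
	ActionRemoveNode        Action = "remove-node"
	ActionRecommission      Action = "recommission"
)

// nextAction picks the next step for the scale handler.
// decommissioning is the ordinal currently being decommissioned, or nil.
// nodeGone reports whether the cluster no longer lists that node.
func nextAction(specReplicas, currentReplicas int32, decommissioning *int32, nodeGone bool) Action {
	if decommissioning != nil {
		switch {
		case specReplicas > *decommissioning:
			// The user raised spec.replicas again: bring the node back.
			return ActionRecommission
		case nodeGone:
			// Safe to decrease status.currentReplicas and drop the pod.
			return ActionRemoveNode
		default:
			return ActionWaitDecommission
		}
	}
	switch {
	case specReplicas > currentReplicas:
		return ActionScaleUp
	case specReplicas < currentReplicas:
		// Decommission the last node (ordinal currentReplicas-1) first.
		return ActionStartDecommission
	default:
		return ActionNone
	}
}

func main() {
	last := int32(2)
	fmt.Println(nextAction(2, 3, nil, false))   // start-decommission
	fmt.Println(nextAction(2, 3, &last, false)) // wait-decommission
	fmt.Println(nextAction(2, 3, &last, true))  // remove-node
	fmt.Println(nextAction(3, 3, &last, false)) // recommission
}
```

Keeping the decision separate from the side effects (admin API calls, status updates) makes each branch easy to test in isolation.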
    nicolaferraro committed Jun 16, 2022
    4d74d4f
  10. operator: implement progressive initialization to let node 0 create initial raft group
    
    This tries to solve the problem with empty seed_servers on node 0. With this change, all fresh clusters will be initially set to 1 replica (via `status.currentReplicas`), until a cluster is created and the operator can verify it via admin API. Then the cluster is scaled to the number of instances desired by the user.
    
    After the cluster is initialized, and for the entire lifetime of the cluster, the `seed_servers` property will be populated with the full list of available servers, in every node of the cluster.
    
    This overcomes redpanda-data#333. Previously, node 0 was always forced to have an empty seed_servers property, but this caused problems when it lost the data dir, as it tried to create a brand-new cluster. With this change, even if node 0 loses the data dir, the seed_servers property will always point to other nodes, so it will try to join the existing cluster.
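
A rough sketch of the progressive initialization rule, assuming a boolean `clusterVerified` that stands in for the admin API check. Both helper names are hypothetical, and returning an empty seed list before the cluster is verified is an assumption about the bootstrap phase, not something the commit message spells out.

```go
package main

import "fmt"

// targetCurrentReplicas returns the value the operator would write to
// status.currentReplicas: a fresh cluster starts with a single node and is
// scaled to the requested size only after the cluster has been verified.
func targetCurrentReplicas(specReplicas int32, clusterVerified bool) int32 {
	if !clusterVerified {
		return 1
	}
	return specReplicas
}

// seedServers returns the seed_servers list for every node. Once the cluster
// is verified, all nodes (node 0 included) get the full list, so a node that
// loses its data dir joins the existing cluster instead of creating a new one.
func seedServers(allNodes []string, clusterVerified bool) []string {
	if !clusterVerified {
		return nil // assumed: the single bootstrap node forms the cluster alone
	}
	return append([]string(nil), allNodes...)
}

func main() {
	nodes := []string{"cluster-0", "cluster-1", "cluster-2"}
	fmt.Println(targetCurrentReplicas(3, false), seedServers(nodes, false))
	fmt.Println(targetCurrentReplicas(3, true), seedServers(nodes, true))
}
```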
    nicolaferraro committed Jun 16, 2022
    5d62c76
  11. operator: consider draining field when checking maintenance mode status

    Since nodes auto-drain as part of their shutdown hooks, activating maintenance mode on a decommissioned node may not actually start any draining process. In that case we just exit.
    nicolaferraro committed Jun 16, 2022
    d48e957
  12. 2b02d34
  13. 3b7b77e
  14. operator: disable maintenance mode hook on node 0 when starting up a fresh cluster
    
    This makes initial cluster formation predictable. When the cluster is first created, it is composed of a single node. On single-node clusters we should not activate maintenance mode, because otherwise a restart of the node would make it drain leadership and the cluster would not form.
    
    On the other hand, enabling maintenance mode when the cluster scales to multiple instances currently causes a restart of node 0. This will be solved when implementing dynamic hooks.
    nicolaferraro committed Jun 16, 2022
    e9bba61
  15. 90812c1
  16. operator: fix maintenance mode activation on decommissioning node (workaround for redpanda-data#4999)
    
    When a node is shut down after decommission, the maintenance mode hooks will trigger. While the process has no visible effect on partitions, it leaves the cluster in an inconsistent state, so that other nodes cannot enter maintenance mode. This change force-resets the flag.
    nicolaferraro committed Jun 16, 2022
    a9efa0a
  17. b1e3fac
  18. operator: mark downscaling as alpha feature and add a startup flag

    We should enable downscaling as a feature gate once the issue with reusable node IDs is fixed.
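
One way such a startup flag could gate the feature is sketched below with Go's standard `flag` package. The flag name `allow-downscaling` and both helper functions are assumptions made for illustration; the commit message does not name the flag or say where the check lives.

```go
package main

import (
	"flag"
	"fmt"
)

// parseAllowDownscaling reads a hypothetical boolean startup flag that
// defaults to false, so the alpha feature is opt-in.
func parseAllowDownscaling(args []string) (bool, error) {
	fs := flag.NewFlagSet("operator", flag.ContinueOnError)
	allow := fs.Bool("allow-downscaling", false, "enable the alpha downscaling feature")
	if err := fs.Parse(args); err != nil {
		return false, err
	}
	return *allow, nil
}

// replicaChangeAllowed accepts upscaling unconditionally and accepts
// downscaling only when the flag is set (an assumed policy).
func replicaChangeAllowed(oldReplicas, newReplicas int32, allowDownscaling bool) bool {
	return newReplicas >= oldReplicas || allowDownscaling
}

func main() {
	allow, err := parseAllowDownscaling([]string{"--allow-downscaling"})
	if err != nil {
		panic(err)
	}
	fmt.Println(replicaChangeAllowed(3, 2, allow)) // downscale with the flag set
	fmt.Println(replicaChangeAllowed(3, 2, false)) // downscale without the flag
}
```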
    nicolaferraro committed Jun 16, 2022
    5bd42df