Properly handle node_id and seed_servers #91
This isn't quite about leadership or election.

Redpanda <= 22.2

The special thing about nodes with seed_servers=[] is how they behave if you start them on an empty drive. Nodes with seed_servers=[] (call it a "founding node") will respond to an empty drive by creating a new cluster of one node, and waiting for other nodes to join it. Nodes with a populated seed_servers have the opposite behavior on an empty drive: they will try to join an existing cluster, and not do anything until they succeed in doing so.

This becomes important in some cloud environments, where the disks aren't really persistent, and orchestrators may be quite casual about just blowing away a node's drive, expecting the cluster to autonomously cope. This doesn't work, because if you blow away the disk from the founding node (the one with seed_servers=[]) then it won't try to rejoin the cluster: it'll start a new cluster of its own.

The way the latest operator code copes with this is to only briefly have seed_servers=[] when starting the cluster for the first time: start 1 node like that, let it come up, and then immediately change its seed_servers to point to its peers, so that if the node is ever restarted with an empty disk, it will rejoin rather than trying to found a new cluster.

Redpanda >= 22.3

The changes planned for 22.3 will provide a simpler way for things like the operator to initialize a cluster: there will be a mode where there is no auto-founding of clusters, and all nodes have seed_servers populated from time zero. Then the operator calls an admin API endpoint on one of the nodes (whichever, but one of them), and that node starts the cluster: it starts a controller log and allows its peers to join. On subsequent disk wipes, there's no risk of anyone founding a new cluster, because there's no admin API call asking it to.
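The two behaviors described above can be illustrated with hedged redpanda.yaml fragments. This is a sketch only: the hostname and port values are borrowed from the chart output quoted later in this thread, not taken from any specific release.

```yaml
# Founding node (pre-22.3 bootstrap): an empty seed_servers list means
# "on an empty drive, create a new single-node cluster and wait for
# peers to join".
redpanda:
  seed_servers: []
---
# Any other node: a populated seed_servers list means "on an empty
# drive, try to join an existing cluster and do nothing until that
# succeeds".
redpanda:
  seed_servers:
    - host:
        address: "redpanda-0.redpanda.redpanda.svc.cluster.local."
        port: 33145
```

The asymmetry between these two fragments is exactly why wiping the founding node's disk is dangerous before 22.3: only the first form can ever create a new cluster.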
Thanks for these details @jcsp, this will help when this ticket gets pulled in.
Closing this as no chart changes will be needed for the solution in 22.3.
Really? I thought the 22.3 core code was going to be broadly backwards compatible, so without changes to the chart I'd have thought you'd still have an issue (i.e. nodes with seed_servers=[] would come up and form a cluster of 1 if you wiped their drive).
We don't set the seed servers to an empty list:

```yaml
seed_servers:
  - host:
      address: "redpanda-0.redpanda.redpanda.svc.cluster.local."
      port: 33145
  - host:
      address: "redpanda-1.redpanda.redpanda.svc.cluster.local."
      port: 33145
  - host:
      address: "redpanda-2.redpanda.redpanda.svc.cluster.local."
      port: 33145
```
Thanks, @jcsp, for asking the right questions. We do need to remove the
The helm chart will always set `redpanda.seed_servers` to be `[]` where `redpanda.node_id` is 0 (broker 0). I believe the issue is that broker 0 may not always be the leader after restarts or if there is a leader election, but the existing statefulset code still assumes broker 0 will be the leader (and sets `seed_servers` to `[]`). See notes below for an explanation of why this is an issue and what we can do (both now and in versions >= 22.3) to resolve it.

There could also have been some issue with broker 0 that caused it to lose leadership, and so it could be in a state where it doesn't have a complete copy of all partitions. In this scenario, setting broker 0 to be the leader would result in data loss. Investigation is needed into what happens once a new leader is elected and then a helm upgrade is applied.

We should also determine the leader prior to restarting the cluster for whatever reason and set `seed_servers` to `[]` for the appropriate broker. Cluster restart should not impact `seed_servers` values, as they will be correctly set on all nodes, including the founding node (after the founding node is started).

Work is being done to allow setting the same `seed_servers` value across all brokers in a cluster; relevant ticket here: redpanda-data/redpanda#333

Also related to this, the following ticket tracks making `node_id` automatically assigned (and no longer set within `redpanda.yaml`): redpanda-data/redpanda#2793

Once the above tickets have associated PRs merged, we wouldn't have to worry about handling either `node_id` or `seed_servers` in the helm chart. See notes below for how `seed_servers` will be handled in the future. For now we ensure the leader (or founding node) initially has it set to `[]` and then populate with other brokers after startup. After 22.3 we can set `seed_servers` for each node in the same way from the beginning.