Seeds Driven Cluster Bootstrap #6744
Force-push e816ecc to 402bf3a.
Can you explain what SDCB and ESCB mean? Maybe adding this to the cover letter would help.
Force-push 402bf3a to 423e510: rebased; cluster_test_fixture now supports the empty_seed_starts_cluster parameter.
Force-push 525596e to 609ebda: review comments addressed.
bootstrap_backend operates on credential_storage for that. In security_frontend, maybe_create_bootstrap_user() is changed into get_bootstrap_user_creds_from_env(), and the actual bootstrap user creation is no longer done.
Client: query all seed_servers in parallel, without a timeout, until results are obtained from all peer seed servers, then verify that both versions and configurations match.
Server: supply the data. initial_seed_brokers() now returns a future.
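The client side of this exchange can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses std::async from the standard library instead of the Seastar futures the PR actually uses, and seed_reply, query_fn and peers_match_local are hypothetical names, not part of the Redpanda code base.

```cpp
#include <functional>
#include <future>
#include <string>
#include <vector>

// Hypothetical reply shape: the logical version and seed list a peer reports.
struct seed_reply {
    int logical_version;
    std::vector<std::string> seed_servers;
};

// Hypothetical stand-in for the RPC that asks one peer for its bootstrap info.
using query_fn = std::function<seed_reply(const std::string&)>;

// Queries every peer concurrently and waits for all replies (no timeout),
// then reports whether each peer matches the local version and seed list.
bool peers_match_local(
  const seed_reply& local,
  const std::vector<std::string>& peers,
  const query_fn& query) {
    std::vector<std::future<seed_reply>> pending;
    pending.reserve(peers.size());
    for (const auto& peer : peers) {
        // One asynchronous task per peer, all started before any wait.
        pending.push_back(std::async(std::launch::async, query, peer));
    }
    bool all_match = true;
    for (auto& f : pending) {
        const seed_reply r = f.get(); // blocks until this peer has answered
        all_match = all_match
                    && r.logical_version == local.logical_version
                    && r.seed_servers == local.seed_servers;
    }
    return all_match;
}
```

The sketch collects every reply before deciding, which mirrors the "until results are obtained from all peer seed servers" wording; a real implementation would also need to retry peers that are not reachable yet.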
cluster_discovery caches whether cluster_uuid is present. is_cluster_founder() determines whether the node should be starting a cluster.
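A minimal sketch of how such a founder decision could look, assuming the rules described elsewhere in this PR (a stored cluster UUID means the node never founds a cluster; in legacy mode only the node with an empty seed list is the root; in seeds-driven mode every listed seed is a founder). The node_state struct and its field names are hypothetical simplifications, not the actual cluster_discovery interface.

```cpp
#include <algorithm>
#include <optional>
#include <string>
#include <vector>

// Hypothetical, simplified view of the state the decision depends on.
struct node_state {
    bool empty_seed_starts_cluster;          // node configuration flag
    std::vector<std::string> seed_servers;   // node configuration seed list
    std::string self_address;                // this node's own address
    std::optional<std::string> cluster_uuid; // cached value from local storage
};

bool is_cluster_founder(const node_state& n) {
    // A stored cluster UUID means the node already belongs to a cluster.
    if (n.cluster_uuid.has_value()) {
        return false;
    }
    if (n.empty_seed_starts_cluster) {
        // Legacy mode: only the node with an empty seed list is the root.
        return n.seed_servers.empty();
    }
    // Seeds-driven mode: every node listed as a seed is a founder.
    return std::find(
             n.seed_servers.begin(), n.seed_servers.end(), n.self_address)
           != n.seed_servers.end();
}
```

Checking the cached UUID first is what keeps a restarted node from founding a second cluster as long as its local storage is intact.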
Cluster founders need to be able to elect a leader so they can decide which node replicates the cluster_bootstrap_cmd, and do that before controller starts.
This commit adds options to the RedpandaService class to:
- change the set of seed servers
- change whether or not node idx 1 is deemed the root node
Peer seed nodes discovered through cluster_discovery before the cluster is bootstrapped are required to be at the same latest logical version as the local node. Therefore, as the cluster is initialized, seed nodes can safely be assumed to be at the latest version. That enables the auto node_id feature, which is essential for cluster bootstrap.
Force-push 609ebda to 50840ed: fix a linter error.
 cluster_discovery::get_cluster_founder_node_id() {
     if (config::node().empty_seed_starts_cluster()) {
         if (config::node().seed_servers().empty()) {
-            return node_id{0};
+            co_return node_id{0};
Do we validate that, in this case, the node_id configured in redpanda.yaml matches the one returned from here?
nope, I agree that it should be added
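For illustration, the missing validation could look something like the sketch below. It is an assumption about the shape of such a check, not code from this PR: validate_founder_node_id is a hypothetical helper, and plain int stands in for model::node_id.

```cpp
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical check: an explicitly configured node_id must agree with the
// id that the bootstrap logic assigns to this founder. An unset node_id
// (std::nullopt) is accepted, since the assigned id is then simply used.
void validate_founder_node_id(std::optional<int> configured, int assigned) {
    if (configured.has_value() && *configured != assigned) {
        throw std::invalid_argument(
          "configured node_id " + std::to_string(*configured)
          + " does not match founder node_id " + std::to_string(assigned));
    }
}
```

Failing fast at startup makes a misconfigured root node visible before it replicates anything to the controller log.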
The original implementation of seeds-driven bootstrap had pieces of the various validation required for bootstrap strewn about startup. Because of this, I found it difficult to reason about what validations are done when, and what work may be duplicated, while reading through the code. This commit puts all validations up front so we determine whether we are a founder immediately and use a cached value thereafter. This is mostly addressing review comments on redpanda-data#6744.
@@ -73,7 +73,8 @@ feature_manager::feature_manager(
 ) {}

-ss::future<> feature_manager::start() {
+ss::future<>
+feature_manager::start(std::vector<model::node_id>&& cluster_founder_nodes) {
please take by value
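A small sketch of what taking the parameter by value means here. The class name feature_manager_sketch and the node_id alias are stand-ins for illustration only; the real feature_manager::start() returns ss::future<> and does more work.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

using node_id = std::int32_t; // stand-in for model::node_id

class feature_manager_sketch {
public:
    // Sink parameter taken by value: an lvalue argument is copied, an rvalue
    // argument is moved, and the body moves the parameter into the member.
    void start(std::vector<node_id> cluster_founder_nodes) {
        _founders = std::move(cluster_founder_nodes);
    }

    const std::vector<node_id>& founders() const { return _founders; }

private:
    std::vector<node_id> _founders;
};
```

Compared with an rvalue-reference parameter, the by-value form accepts both `fm.start(ids)` and `fm.start(std::move(ids))`, and the signature alone tells the caller that the function keeps its own copy.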
Cover letter
This PR addresses the second part of the scope of the "Cluster Bootstrap" project. The first part was addressed by #6659.
This PR introduces a new node configuration parameter, empty_seed_starts_cluster, that can disable the Empty Seed Cluster Bootstrap (ESCB) mode, AKA Root Driven Cluster Bootstrap (legacy), of bootstrapping a cluster. With it disabled, the set of servers listed as seeds in each node configuration start a cluster together as soon as they form a raft group and elect a leader. The new bootstrapping mode is referred to as Seeds Driven Cluster Bootstrap (SDCB).
In either mode, the cluster now gets a cluster UUID, reflected by a new controller log message that lands second in the controller log, right after the initial raft configuration message. The cluster UUID is also stored in the kvstore of shard0 on every node.
In the new Seeds Driven bootstrap mode, all seed servers must be available, with identical node configurations, for a cluster to be created. Afterwards, none of the seed servers should try to form another new cluster ("split-brain") if its local storage is wiped, unless all seed nodes are wiped at the same time.
Fixes #333
Backport Required
UX changes
Node Configuration in redpanda.yaml
- empty_seed_starts_cluster (default: true) switches between the Empty Seed Cluster Bootstrap (legacy) mode and the Seeds Driven Cluster Bootstrap mode. When disabled, it must be disabled on all seed nodes.
- seed_servers must be identical on every seed node in the Seeds Driven Cluster Bootstrap mode.
Release notes
Features
- Seeds Driven Cluster Bootstrap mode: disable empty_seed_starts_cluster to use it. That allows the set of servers listed as seeds to start a cluster together. All seed servers must be available, with identical node configurations, for a cluster to be created. Afterwards, none of the seed servers will try to form another new cluster if its local storage is wiped, unless all seed nodes are wiped at the same time. The cluster now gets a cluster UUID, reflected by a new controller log message and stored in the kvstore.
Improvements