-
Notifications
You must be signed in to change notification settings - Fork 577
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Node UUID map initialized from cluster discovery #7204
Node UUID map initialized from cluster discovery #7204
Conversation
"the cluster will proceed, however the cluster will have less actual " | ||
"seed nodes that expected, and may even fail to bootstrap", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought neither of these would be true with the changes in this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think those are irrelevant to this PR. There will be less seed nodes because the one with different avertised_rpc_api
will never act as a seed node (get_node_index_in_seed_servers()
will always return empty). The cluster will fail to bootstrap if the number of misconfigured nodes is at least a half of the total.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will never act as a seed node
Ah good point about a misconfigured majority. It's not really problematic for less than a majority, since as a non-founder the node would just join when a majority comes up.
Node UUID to ID map is collected from cluster_bootstrap_infos. Analysis of cluster_bootstrap_infos refactored into a single loop. Consistency checks extraced into a function. Added a check for node UUID uniqueness. Added a check for advertised_rpc_api matching seed address. Cleanup.
Node UUID map is stored to bootstrap_cluster_cmd. create_cluster() refactored to take cmd_data as an arg.
bootstrap_backend now references to members_manager for that.
1e6600c
to
e76fcf5
Compare
force-push: addressed comments, fixed the lack of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
// the list of founding brokers and the node_uuid map. | ||
brokers founding_brokers; | ||
founding_brokers.reserve(config_seed_servers.size()); | ||
node_ids_by_uuid node_ids; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: maybe call this node_uuid_map
or node_id_by_uuid
or something? Seems easy to conflate with unique_node_ids
.
// node_ids_by_uuid initialization | ||
// =============================== | ||
// If the node UUID to ID map in `members_manager` were not initialized at | ||
// cluster bootstrap, that would have allowed slow seed nodes (ones that miss | ||
// their oportunity to be founding nodes because of being slow) to also possibly | ||
// miss the assignment of `node_id` based on `seed_servers` index (as expected | ||
// for seed nodes). That could happen if another node joins the cluster before a | ||
// slow seed node and grabs the `node_id` that should have belonged to it. | ||
// | ||
// `bootstrap_backend` references `members_manager` to be able to initialize the | ||
// node UUID to ID map in it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: IMO kinda odd placement; not sure folks would know to look for this context here. Maybe move this down by L146? or somewhere in the header?
@@ -2036,10 +2036,13 @@ struct bootstrap_cluster_cmd_data | |||
const bootstrap_cluster_cmd_data&, const bootstrap_cluster_cmd_data&) | |||
= default; | |||
|
|||
auto serde_fields() { return std::tie(uuid, bootstrap_user_cred); } | |||
auto serde_fields() { | |||
return std::tie(uuid, bootstrap_user_cred, node_ids_by_uuid); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should there have been a version bump here? @dlex
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
22.3.1 is the first release with this message, so there isn't a compatibility concern. If this PR had landed in 22.3.2 we would've had to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
coo
Cover letter
The node UUID to ID map in
members_manager
was not initialized until it was populated by cluster joiners. That has allowed for slow seed nodes (ones that miss their oportunity to be founding nodes because of being slow) to also possibly miss the assignment ofnode_id
based on theirseed_servers
index. That could happen if another node joins the cluster before a slow seed node and grabs thenode_id
that should have belonged to it.To prevent that, the node UUID to ID map needs to be initialized at the time of cluster bootstrap phase. This PR introduces
cluster_bootstap_info
cluster_bootstap_info
replies from the seedsbootstrap_cluster_cmd
members_manager
Fixes #333
Backport Required
UX changes
Release notes