admin: reject maintenance mode req on 1 node cluster #4921

Merged: 2 commits, May 31, 2022
11 changes: 11 additions & 0 deletions src/v/cluster/members_table.cc
@@ -159,6 +159,17 @@ members_table::apply(model::offset version, maintenance_mode_cmd cmd) {
    return errc::success;
}

if (_brokers.size() < 2) {
Contributor:
Unrelated to this change per se, but do we also need a similar check when applying a decommission command? Maybe an even heavier-handed one, e.g. don't decommission to below the default replication factor, or don't decommission if there are single-replica partitions hosted on the affected node?

Member:
Decommission will only finish if there are enough nodes in the cluster to maintain each topic's requested replication factor. For a single-node cluster, the decommission will only finish once another node is added to the cluster.

Contributor Author:
@andrwng yes, see the comment above "there is potentially an issue with decoms...". Basically shrinking below 3 is already broken, and needs handling as a separate job.
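
The completion condition described in this thread can be sketched as follows. This is only an illustration under stated assumptions: `can_finish_decommission` and its parameters are hypothetical names, not Redpanda APIs, and requiring at least one remaining node even when no topics exist is an assumption of the sketch.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of Redpanda: returns true when the nodes
// left after removing one node can still host every topic at its requested
// replication factor.
inline bool can_finish_decommission(
  std::size_t cluster_size,
  const std::vector<std::size_t>& replication_factors) {
    if (cluster_size == 0) {
        return false;
    }
    const std::size_t remaining = cluster_size - 1;

    // Assumption: at least one node must remain, even with no topics.
    std::size_t required = 1;
    for (std::size_t rf : replication_factors) {
        required = std::max(required, rf);
    }
    return remaining >= required;
}
```

Under these assumptions, a single-node cluster never satisfies the condition until a second node joins, which matches the behaviour described above.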

    // Maintenance mode is refused on size 1 clusters in the admin API, but
    // we might be upgrading from a version that didn't have the validation.
    vlog(
      clusterlog.info,
      "Dropping maintenance mode enable operation on single node cluster");

    // Return success to enable progress: this is a clean no-op.
    return errc::success;
}

if (
  target->second->get_maintenance_state()
  == model::maintenance_state::active) {
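
The apply-time guard in `members_table.cc` can be illustrated with a self-contained sketch. The types below (`apply_result`, `members_table_sketch`) are hypothetical stand-ins rather than the real Redpanda classes, and the ordering of the earlier checks is an assumption based on the visible context lines. The point it shows is that an enable command on a single-node cluster is dropped, yet still reports success so that command replay can proceed.

```cpp
#include <cstddef>
#include <map>

// Hypothetical stand-ins for the real Redpanda types; names are illustrative.
enum class apply_result { success, node_not_found };

struct members_table_sketch {
    // node id -> whether maintenance mode is enabled on that node
    std::map<int, bool> brokers;

    apply_result apply_maintenance(int node_id, bool enable) {
        auto target = brokers.find(node_id);
        if (target == brokers.end()) {
            return apply_result::node_not_found;
        }
        if (!enable) {
            // Disabling is always allowed.
            target->second = false;
            return apply_result::success;
        }
        if (brokers.size() < 2) {
            // Drop the enable command on a single-node cluster, but report
            // success: the command may come from a version without the
            // admin API validation, and failing here would block progress.
            return apply_result::success;
        }
        target->second = true;
        return apply_result::success;
    }
};
```

Returning success instead of an error keeps the command a clean no-op, as the code comment in the diff puts it.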
7 changes: 7 additions & 0 deletions src/v/redpanda/admin_server.cc
@@ -1798,6 +1798,13 @@ void admin_server::register_broker_routes() {
    throw ss::httpd::bad_request_exception(
      "Maintenance mode feature not active (upgrade in progress?)");
}

if (
  _controller->get_members_table().local().all_brokers().size() < 2) {
    throw ss::httpd::bad_request_exception(
      "Maintenance mode may not be used on a single node cluster");
}

model::node_id id = parse_broker_id(*req);
auto ec = co_await _controller->get_members_frontend()
            .local()
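
The request-time validation in `admin_server.cc` can be sketched in isolation as below. The function names are hypothetical, and `std::invalid_argument` stands in for `ss::httpd::bad_request_exception` so that the example compiles without Seastar. The error messages are the ones from the diff.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: returns an empty string when the request may proceed,
// otherwise the message that would accompany the bad-request response.
inline std::string maintenance_request_error(
  bool feature_active, std::size_t broker_count) {
    if (!feature_active) {
        return "Maintenance mode feature not active (upgrade in progress?)";
    }
    if (broker_count < 2) {
        return "Maintenance mode may not be used on a single node cluster";
    }
    return "";
}

// Throwing variant; std::invalid_argument is a stand-in for the HTTP
// bad-request exception used in the real handler.
inline void validate_maintenance_request(
  bool feature_active, std::size_t broker_count) {
    const std::string err = maintenance_request_error(
      feature_active, broker_count);
    if (!err.empty()) {
        throw std::invalid_argument(err);
    }
}
```

Rejecting the request here gives the caller an immediate error, while the guard in `members_table.cc` covers commands that were already written by a version lacking this check.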