-
Notifications
You must be signed in to change notification settings - Fork 577
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
operator: use pod annotation for node_id and avoid mixing maintenance mode with decommissioning #7581
Conversation
bcba247
to
2d1e2a3
Compare
2d1e2a3
to
b588c51
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, no blockers but couple of nits
pretty interesting we did not have to touch much tests 😅
// "selector" are unavailable after the eviction, i.e. even in absence of | ||
// the evicted pod. For example, one can prevent all voluntary evictions | ||
// the evicted Pod. For example, one can prevent all voluntary evictions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: just curious why is Pod capital? :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
k8s resources are proper nouns. Making this distinction helps comprehension, especially with Node and Service. I forget why this caught my attention but I did put it in its own commit in case it was hated 😁
src/go/k8s/cmd/configurator/main.go
Outdated
@@ -281,7 +279,7 @@ func registerAdvertisedKafkaAPI( | |||
} | |||
|
|||
func registerAdvertisedPandaproxyAPI( | |||
c *configuratorConfig, cfg *config.Config, index brokerID, proxyAPIPort int, | |||
c *configuratorConfig, cfg *config.Config, index int, proxyAPIPort int, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this index is podOrdinal? Would. be nice to use just one name for the thing to avoid confusion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking the same thing but was concerned about blowing up scope since this was originally wanted within the day. I think I've addressed this.
@@ -414,7 +414,7 @@ func isSafeToRestart( | |||
) bool { | |||
configVersions := make(map[int64]bool) | |||
for i := range clusterStatus { | |||
log.Info(fmt.Sprintf("Node %d is using config version %d", clusterStatus[i].NodeID, clusterStatus[i].ConfigVersion)) | |||
log.Info(fmt.Sprintf("Node %d is using config version %d", clusterStatus[i].NodeID, clusterStatus[i].ConfigVersion)) // actually pod ordinal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should all this then say pod with ordinal X is using....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I was wrong. Removed the comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have problem with recommission logic and new feature gate. Other than that LGTM
@@ -224,25 +228,30 @@ func (r *StatefulSetResource) handleRecommission(ctx context.Context) error { | |||
return err | |||
} | |||
|
|||
broker, err := getNodeInfoFromCluster(ctx, *r.pandaCluster.Status.DecommissioningNode, adminAPI) | |||
broker, err := getNodeInfoFromCluster(ctx, targetOrdinal, adminAPI) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is problem with this call.
redpanda/src/go/k8s/pkg/resources/statefulset_scale.go
Lines 354 to 368 in ff319d3
// getNodeInfoFromCluster allows to get broker information using the admin API | |
func getNodeInfoFromCluster( | |
ctx context.Context, ordinal int32, adminAPI adminutils.AdminAPIClient, | |
) (*admin.Broker, error) { | |
brokers, err := adminAPI.Brokers(ctx) | |
if err != nil { | |
return nil, fmt.Errorf("could not get the list of brokers for checking decommission: %w", err) | |
} | |
for i := range brokers { | |
if brokers[i].NodeID == int(ordinal) { | |
return &brokers[i], nil | |
} | |
} | |
return nil, nil | |
} |
The second argument is the Pod ordinal. The for loop will look for node with wrong ID. I think you should swap the getPodNodeIDFromAnnotation
with getNodeInfoFromCluster
and change targetOrdinal
with brokerID
as second argument in getNodeInfoFromCluster
function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to have e2e test that verifies that recommission broker is called. Maybe unit test would be enough to simulate 2 disjoint collections of ids:
- POD ordinals - 0,1,2
- Broker ID - 7, 8, 9
Unit test would need to assert that RecommissionBroker was called.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just one item, otherwise lgtm
// the last number is the index of replica. This index is then propagated | ||
// to redpanda.node_id. | ||
func hostIndex(hostName string) (brokerID, error) { | ||
// getPodOrdinalFromHostname takes advantage of pod naming convention in Kubernetes StatfulSet |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// getPodOrdinalFromHostname takes advantage of pod naming convention in Kubernetes StatfulSet | |
// getPodOrdinalFromHostname takes advantage of Pod naming convention in Kubernetes StatefulSet |
closing as it's horribly out-of-date and needs to be redone (if at all). |
There were places where the pod ordinal was being used for the node_id (broker id). With v22.3.x those numbers may not match. By using the pod annotation, we can use the correct node_id.
This strategy doesn't work, however, when mixing decommission and maintenance mode. The preStop script would always set maintenance mode, even if the node was decommissioned, or in the process of being decommissioned. This avoids the need for running that workaround by checking in the preStop script and not setting maintenance mode if decommissioned/decommissioning.
Fixes #7541
Backports Required
UX Changes
N/A
Release Notes
Bug Fixes