Ensure EndpointSlice exist if Endpoint is found #84421

Merged: k8s-ci-robot merged 1 commit into kubernetes:master on Oct 31, 2019

Conversation

tnqn
Member

@tnqn tnqn commented Oct 27, 2019

What type of PR is this?
/kind bug

What this PR does / why we need it:
The EndpointSlice for masters was not created after the EndpointSlice feature was enabled on a pre-existing cluster. This happened because the Endpoints object already existed, so ReconcileEndpoints skipped creating or updating it after the EndpointSlice feature was enabled. As a result, no EndpointSlice was written for it either.

This patch ensures that the EndpointSlice is consistent with the Endpoints object every time the reconciler reconciles Endpoints, even if the Endpoints object is unchanged. It also skips the update when the desired EndpointSlice already matches the existing one.
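
A minimal sketch of the create, skip, or update flow described above. It is not the code from this PR: `EndpointSlice`, `fakeStore`, `errNotFound`, and `ensureSlice` are hypothetical stand-ins for the real API type and client, and `reflect.DeepEqual` is used here in place of the semantic equality helper that appears in the test diff.

```go
package main

import (
	"errors"
	"fmt"
	"reflect"
)

// EndpointSlice is a simplified, hypothetical stand-in for the real API type.
type EndpointSlice struct {
	Name      string
	Addresses []string
	Ports     []int32
}

// errNotFound is a hypothetical sentinel that mimics a "not found" API error.
var errNotFound = errors.New("endpointslice not found")

// fakeStore is a hypothetical in-memory replacement for the API client.
// It counts writes so the effect of each reconcile pass is visible.
type fakeStore struct {
	slices  map[string]EndpointSlice
	creates int
	updates int
}

func newFakeStore() *fakeStore {
	return &fakeStore{slices: map[string]EndpointSlice{}}
}

func (s *fakeStore) Get(name string) (EndpointSlice, error) {
	es, ok := s.slices[name]
	if !ok {
		return EndpointSlice{}, errNotFound
	}
	return es, nil
}

func (s *fakeStore) Create(es EndpointSlice) {
	s.slices[es.Name] = es
	s.creates++
}

func (s *fakeStore) Update(es EndpointSlice) {
	s.slices[es.Name] = es
	s.updates++
}

// ensureSlice brings the stored slice in line with desired:
// create it if missing, leave it alone if equal, update it otherwise.
func ensureSlice(s *fakeStore, desired EndpointSlice) error {
	existing, err := s.Get(desired.Name)
	if errors.Is(err, errNotFound) {
		s.Create(desired)
		return nil
	}
	if err != nil {
		return err
	}
	if reflect.DeepEqual(existing, desired) {
		return nil // already consistent, so no write is issued
	}
	s.Update(desired)
	return nil
}

func main() {
	store := newFakeStore()
	// Three reconcile passes: slice missing, slice unchanged, slice changed.
	passes := [][]string{{"10.0.0.1"}, {"10.0.0.1"}, {"10.0.0.1", "10.0.0.2"}}
	for _, addrs := range passes {
		desired := EndpointSlice{Name: "kubernetes", Addresses: addrs, Ports: []int32{6443}}
		if err := ensureSlice(store, desired); err != nil {
			fmt.Println("reconcile failed:", err)
			return
		}
	}
	fmt.Printf("creates=%d updates=%d\n", store.creates, store.updates)
}
```

In this sketch the first pass creates the slice, the second pass writes nothing, and the third pass issues a single update, which mirrors the two behaviours the description names.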

Which issue(s) this PR fixes:

Fixes #84419

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

Fixed a bug where the EndpointSlice for masters wasn't created after enabling the EndpointSlice feature on a pre-existing cluster.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added the release-note, kind/bug, size/M, cncf-cla: yes, needs-sig, and needs-priority labels Oct 27, 2019
@k8s-ci-robot
Contributor

Hi @tnqn. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test label Oct 27, 2019
@k8s-ci-robot k8s-ci-robot added sig/api-machinery and removed needs-sig labels Oct 27, 2019
@tnqn
Member Author

tnqn commented Oct 27, 2019

/assign @robscott

@robscott
Member

Hey @tnqn, thanks for catching this! Looks like a good fix on first glance, will try to take a closer look at this tomorrow.
/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test and removed needs-ok-to-test labels Oct 27, 2019
@robscott
Member

/priority important-soon

@k8s-ci-robot k8s-ci-robot added priority/important-soon and removed needs-priority labels Oct 27, 2019
@tnqn
Member Author

tnqn commented Oct 28, 2019

@robscott thanks for reviewing it.
/retest

Member

@robscott robscott left a comment

Hey @tnqn, this looks great, just have a suggestion on how to simplify the tests you're adding. I spun up an e2e cluster with this branch and the upgrade worked as expected simply by toggling the EndpointSlice feature flag on the apiserver manifest. Thanks for the fix!

@@ -98,6 +108,28 @@ func TestEndpointsAdapterGet(t *testing.T) {
			if !apiequality.Semantic.DeepEqual(endpoints, testCase.expectedEndpoints) {
				t.Errorf("Expected endpoints: %v, got: %v", testCase.expectedEndpoints, endpoints)
			}

			epSliceList, err := client.DiscoveryV1alpha1().EndpointSlices(testCase.namespaceParam).List(metav1.ListOptions{})
Member

All of this test logic can actually be simplified because the EndpointSlice created here will always have a consistent name (nameParam here, but generally just kubernetes). I think it would make sense to test based on the existence of that EndpointSlice specifically with a Get call instead of a List.

Member Author

Makes sense. I have updated, PTAL.

Member

Hey @tnqn, this looks great, I think it could be simplified one more step to just use the deepEqual whether or not the expected value was nil. Something like what's done above for Endpoints: https://github.com/kubernetes/kubernetes/blob/master/pkg/master/reconcilers/endpointsadapter_test.go#L262-L264
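
A small sketch of the two suggestions in this thread: fetch the slice by its fixed name rather than listing, and compare it to the expected value with a single deep-equality call whether or not the expected value is nil. The types and helpers (`EndpointSlice`, `getSlice`, `matches`) are hypothetical, and `reflect.DeepEqual` stands in for the semantic equality helper the real test uses.

```go
package main

import (
	"fmt"
	"reflect"
)

// EndpointSlice is a simplified, hypothetical stand-in for the real API type.
type EndpointSlice struct {
	Name      string
	Addresses []string
}

// getSlice is a hypothetical lookup by fixed name.
// It returns nil when no slice with that name exists.
func getSlice(store map[string]*EndpointSlice, name string) *EndpointSlice {
	return store[name]
}

// matches compares got and expected in one call. No nil special case is
// needed: two nil pointers of the same type are deeply equal, and a nil
// pointer is never deeply equal to a non-nil one.
func matches(got, expected *EndpointSlice) bool {
	return reflect.DeepEqual(got, expected)
}

func main() {
	store := map[string]*EndpointSlice{
		"kubernetes": {Name: "kubernetes", Addresses: []string{"10.0.0.1"}},
	}
	want := &EndpointSlice{Name: "kubernetes", Addresses: []string{"10.0.0.1"}}

	// Case 1: a slice is expected and present.
	fmt.Println(matches(getSlice(store, "kubernetes"), want))
	// Case 2: no slice is expected and none exists.
	fmt.Println(matches(getSlice(store, "missing"), nil))
	// Case 3: no slice is expected but one exists.
	fmt.Println(matches(getSlice(store, "kubernetes"), nil))
}
```

The first two cases compare equal and the third does not, so one comparison covers both the "expected a slice" and "expected nothing" test cases.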

Member Author

Sure, done.

@tnqn
Member Author

tnqn commented Oct 29, 2019

@robscott thanks for the suggestion!

@k8s-ci-robot k8s-ci-robot added size/S and removed size/M labels Oct 29, 2019
@robscott
Member

robscott commented Oct 29, 2019

Thanks for catching this @tnqn! This fix works well for me.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm label Oct 29, 2019
@robscott
Member

@liggitt are you able to take a look here? I think you helped review/approve my initial PR here back in the day. This looks like a good fix that should help anyone upgrading to a cluster with EndpointSlices enabled.

/assign @liggitt

@liggitt
Member

liggitt commented Oct 29, 2019

This happened because the Endpoints object already existed, so ReconcileEndpoints skipped creating or updating it after the EndpointSlice feature was enabled.

Can you clarify this point? Does the endpoints controller not reconcile the kube-system/kubernetes Endpoints object to an EndpointSlice object? Or is the issue that the masters must bootstrap their EndpointSlice object just like their Endpoints object in order to make sure the API server is reachable from wherever the kube-controller-manager is running?

@k8s-ci-robot k8s-ci-robot added size/M and removed lgtm, size/S labels Oct 30, 2019
@tnqn
Member Author

tnqn commented Oct 30, 2019

This happened because the Endpoints object already existed, so ReconcileEndpoints skipped creating or updating it after the EndpointSlice feature was enabled.

Can you clarify this point? Does the endpoints controller not reconcile the kube-system/kubernetes Endpoints object to an EndpointSlice object?
Or is the issue that the masters must bootstrap their EndpointSlice object just like their Endpoints object in order to make sure the API server is reachable from wherever the kube-controller-manager is running?

The endpoints controller is not in charge of the default/kubernetes Endpoints and EndpointSlice:

if service.Spec.Selector == nil {
	// services without a selector receive no endpoint slices from this controller;
	// these services will receive endpoint slices that are created out-of-band via the REST API.
	return nil
}

Instead, it is the EndpointReconciler (leaseEndpointReconciler by default) in the masters that reconciles their Endpoints and EndpointSlice:
func (r *leaseEndpointReconciler) doReconcile(serviceName string, endpointPorts []corev1.EndpointPort, reconcilePorts bool) error {

I think it's the second case you mentioned: the masters must create their own EndpointSlice so that kube-controller-manager can reach them.

@tnqn
Member Author

tnqn commented Oct 30, 2019

/retest

@k8s-ci-robot k8s-ci-robot added size/L and removed size/M labels Oct 31, 2019
@tnqn
Member Author

tnqn commented Oct 31, 2019

/retest

@tnqn tnqn force-pushed the missing-endpointslice branch 2 times, most recently from 6933823 to f3041fa Compare October 31, 2019 12:02
@liggitt
Member

liggitt commented Oct 31, 2019

/approve

@robscott has final lgtm

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: liggitt, tnqn

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved label Oct 31, 2019
@tnqn
Member Author

tnqn commented Oct 31, 2019

Thanks! @liggitt

/retest

@robscott
Member

Thanks for all your work on this @tnqn!

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm label Oct 31, 2019
@robscott
Member

/retest

@k8s-ci-robot k8s-ci-robot merged commit 4002e4c into kubernetes:master Oct 31, 2019
@k8s-ci-robot k8s-ci-robot added this to the v1.17 milestone Oct 31, 2019
Labels
approved, cncf-cla: yes, kind/bug, lgtm, ok-to-test, priority/important-soon, release-note, sig/api-machinery, size/L
Projects
None yet
Development

Successfully merging this pull request may close these issues.

EndpointSlice is not created after enabling EndpointSlice on a pre-existing cluster
4 participants