NLB support #3088

Closed
carlosonunez opened this issue Jan 20, 2022 · 72 comments · Fixed by #3804
Labels
kind/feature Categorizes issue or PR as related to a new feature. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@carlosonunez

/kind feature

Describe the solution you'd like

As an engineer looking to deploy Kubernetes with CAPA
I would like my control plane to be fronted by a Network Load Balancer instead of Classic ELBs
So that I can continue to use CAPA after AWS deprecates them

Anything else you would like to add:

Classic ELBs are getting deprecated in August 2022, and AWS SDK for Go supports NLBs. We should use them!

More info here
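
For context on the API surface involved: the AWS SDK for Go exposes NLBs through the same ELBv2 CreateLoadBalancer call used for ALBs, with Type set to "network". A minimal, illustrative sketch (the name, region, and subnet IDs below are placeholders, and this is not CAPA's actual implementation):

```go
package main

import (
	"fmt"
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/elbv2"
)

func main() {
	sess := session.Must(session.NewSession(aws.NewConfig().WithRegion("eu-central-1")))
	client := elbv2.New(sess)

	// Create an internet-facing NLB in front of the control plane subnets.
	out, err := client.CreateLoadBalancer(&elbv2.CreateLoadBalancerInput{
		Name:    aws.String("my-cluster-apiserver"),                      // placeholder
		Type:    aws.String(elbv2.LoadBalancerTypeEnumNetwork),           // "network"
		Scheme:  aws.String(elbv2.LoadBalancerSchemeEnumInternetFacing),
		Subnets: aws.StringSlice([]string{"subnet-aaaa", "subnet-bbbb"}), // placeholders
	})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("API server LB DNS name:", aws.StringValue(out.LoadBalancers[0].DNSName))
}
```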

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. needs-priority needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jan 20, 2022
@k8s-ci-robot
Contributor

@carlosonunez: This issue is currently awaiting triage.

If CAPA/CAPI contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@sedefsavas
Contributor

sedefsavas commented Jan 20, 2022

Thanks for bringing this up @carlosonunez
I think we are tracking this somewhere else; I'll find out and link it here.

There is an ongoing test to see how to deal with the hairpinning issue that comes with NLBs (when the LB directs kubelet traffic to the api-server on the same node).

@carlosonunez
Author

That's great that this is being tracked elsewhere! I couldn't find any issues pertaining to it and thought "That can't be right!" :)

@cbusch-pivotal

Any update on whether this truly is being tracked? Thanks!

@sedefsavas sedefsavas added triage/accepted Indicates an issue or PR is ready to be actively worked on. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jan 24, 2022
@sedefsavas sedefsavas added this to the v1.x milestone Jan 24, 2022
@sedefsavas sedefsavas self-assigned this Jan 24, 2022
@dlipovetsky
Contributor

I haven't seen an issue tracking NLB support, but we considered using NLBs back in 2018: #115 (comment)

@dlipovetsky
Contributor

There is an ongoing test to see how to deal with the hairpinning issue that comes with NLBs (when the LB directs kubelet traffic to the api-server on the same node).

For reference, CAPZ also has a hairpinning issue, although specifically when using an internal load balancer: kubernetes-sigs/cluster-api-provider-azure#1033 (comment). The current workaround is to prevent requests from kubelet on the control plane machines from going through the LB, by resolving the LB's DNS record to 127.0.0.1.

@sedefsavas
Contributor

I could not find a specific issue for this. I see it was discussed in the CAPA v1alpha4 API changes document, but during today's office hours we triaged this issue and decided to keep it.

@sedefsavas
Contributor

For reference, CAPZ also has a hairpinning issue, although specifically when using an internal load balancer: kubernetes-sigs/cluster-api-provider-azure#1033 (comment). The current workaround is to prevent requests from kubelet on the control plane machines from going through the LB, by resolving the LB's DNS record to 127.0.0.1.

Thanks @dlipovetsky, good to know there is an example workaround. If I can't make it work by disabling client IP preservation, this might be the way to go.

@lubronzhan

lubronzhan commented Jan 25, 2022

As suggested by @sedefsavas, changing the target group type from instance to IP can also work around this issue: https://aws.amazon.com/premiumsupport/knowledge-center/target-connection-fails-load-balancer/
The downside is that an instance's IP might change during its lifecycle, so the corresponding target group needs to be updated, and the NLB's listeners need to be updated as well. This could cause downtime.

Update
With an instance-type target, I can still curl the NLB endpoint from within the control plane, given an appropriate SecurityGroup.
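
For illustration only, the difference between the two target-group modes discussed above comes down to the TargetType passed to the ELBv2 API. This is not CAPA code; the name, port, and VPC ID are placeholders:

```go
package lbsketch

import (
	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/service/elbv2"
)

// createAPITargetGroup creates the control-plane target group with the given
// target type: elbv2.TargetTypeEnumInstance registers instance IDs (and keeps
// following the instance if its IP changes), while elbv2.TargetTypeEnumIp
// registers raw IPs (one way around hairpinning, at the cost of having to
// re-register targets whenever an instance IP changes).
func createAPITargetGroup(client *elbv2.ELBV2, targetType string) (*elbv2.TargetGroup, error) {
	out, err := client.CreateTargetGroup(&elbv2.CreateTargetGroupInput{
		Name:       aws.String("my-cluster-apiserver-tg"), // placeholder
		Protocol:   aws.String(elbv2.ProtocolEnumTcp),
		Port:       aws.Int64(6443),
		VpcId:      aws.String("vpc-0123456789abcdef0"), // placeholder
		TargetType: aws.String(targetType),
	})
	if err != nil {
		return nil, err
	}
	return out.TargetGroups[0], nil
}
```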

@sedefsavas
Contributor

Just a correction on the retirement of Classic Load Balancers mentioned in the issue:

EC2-Classic networking is being retired, not Classic Load Balancers.
Classic Load Balancers that are used in EC2-Classic networking will need to be migrated to a load balancer in a VPC. Classic Load Balancers already in a VPC won't be affected.

TL;DR: the August 2022 retirement date is not for Classic ELBs.

@carlosonunez
Author

carlosonunez commented Feb 17, 2022 via email

@anmolsachan

anmolsachan commented Mar 1, 2022

This feature (NLB for the control plane) is very much needed in the market, irrespective of CLBs still being around.
Please let me know if a contribution is needed for this.

@vincepri
Member

Are we planning to convert the existing Classic ELBs to NLBs?

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 10, 2022
@voor
Member

voor commented Jul 10, 2022

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 10, 2022
@sedefsavas sedefsavas changed the title Use NLBs instead of Classic ELBs NLB support Jul 28, 2022
@sedefsavas
Contributor

While implementing this, we can avoid introducing the problem of creating LBs with identical names for clusters created with the same name in different namespaces: #969

NLB names need to be unique within the same AWS account and region.
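
A purely illustrative naming sketch (not the scheme the eventual PR uses): ELBv2 names are limited to 32 characters and must be unique per account and region, so deriving the name from namespace plus cluster name, rather than the cluster name alone, avoids the collision described in #969.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// nlbName builds a <=32-character load balancer name that differs for
// identically named clusters living in different namespaces.
func nlbName(namespace, clusterName string) string {
	sum := sha256.Sum256([]byte(namespace + "/" + clusterName))
	hash := hex.EncodeToString(sum[:])[:8]
	base := clusterName
	if len(base) > 23 { // 23 + "-" + 8 = 32
		base = base[:23]
	}
	return fmt.Sprintf("%s-%s", base, hash)
}

func main() {
	// Same cluster name, different namespaces -> different LB names.
	fmt.Println(nlbName("team-a", "prod"))
	fmt.Println(nlbName("team-b", "prod"))
}
```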

@Skarlso
Contributor

Skarlso commented Nov 2, 2022

Hmm, I mean, generally having multiple clusters in the same VPC will end up with each cluster being able to access the other clusters on the same network anyway, sort of, right?

@voor
Member

voor commented Nov 2, 2022

Just wondering if you're over-rotating on something that shouldn't be considered a problem -- what is the concern with something going straight to the EC2 instance's port 6443? It's not being advertised, and even if something did hit it directly it's the same security as if it went through the NLB. There is nothing magical happening at the NLB to shape or modify that traffic, the NLB is effectively some iptables magic itself to handle load balancing and forwarding the proper traffic.

@Skarlso
Contributor

Skarlso commented Nov 2, 2022

@voor Right. However, opening up 6443 to the world is a bit of a stretch, isn't it? :) Even if it isn't advertised. And yes, an NLB is just some traffic-routing magic; it doesn't modify the request at all. But you would still want some amount of security for traffic coming in on a port. You generally don't open ports to the outside world.

@Skarlso
Contributor

Skarlso commented Nov 2, 2022

BTW, on it being public: with IPv6 there are no private addresses, so this becomes a moot point. If you use an IPv6 VPC, the port will "technically" be open to the world anyway.

@Skarlso
Contributor

Skarlso commented Nov 2, 2022

@voor I get what you are saying. The reason I need the port opened is that right now it is only open to a security group as the source, which doesn't work with the NLB. But yes, since the node is in a private network, opening it to 0.0.0.0/0 shouldn't be a problem; I would just like to be absolutely sure about that. :D

For example, I'm not sure whether you can have control plane nodes in public networks if you so choose.

@Skarlso
Contributor

Skarlso commented Nov 2, 2022

You can technically have control plane nodes in public subnets if you bring your own cluster. I'm saying technically only because it's not really advised... :D

@Skarlso
Contributor

Skarlso commented Nov 3, 2022

Successfully created an NLB. Now lots of testing remains, plus making it work with IPv6 (that shouldn't be a problem :)).

@johannesfrey

johannesfrey commented Nov 4, 2022

@voor I get what you are saying. The reason I need the port opened is that right now it is only open to a security group as the source, which doesn't work with the NLB. But yes, since the node is in a private network, opening it to 0.0.0.0/0 shouldn't be a problem; I would just like to be absolutely sure about that. :D

For example, I'm not sure whether you can have control plane nodes in public networks if you so choose.

I'm just wondering (if opening to 0.0.0.0/0 should be the default way to go here) if we still have a (configurable) way to restrict the access to the api-server for specific clients? Or is this a general "problem" of NLBs, as they don't work with SGs and the SG should rather be seen in conjunction with the actual EC2 instance of the control-plane? If so, in addition to the client IP that should be granted access, do you know if the internal IP of the NLB must also be added to the target security group of the instance?

@Skarlso
Contributor

Skarlso commented Nov 4, 2022

The NLB doesn't really have a stable internal IP, or rather it's constantly changing, so that would be a difficult matter. However, since the NLB is part of the VPC, it's enough to add the VPC CIDR to the security group. That way, any IP the NLB gets will be allowed, and at the very least it's not 0.0.0.0/0. :)

And yes, that seems to be the consensus when dealing with NLBs, since they don't have security group attachments, similar to NAT gateways. It's a Layer 4 load balancer. Historically though, as I've read in many forums from AWS employees, there is no REAL reason they couldn't make it work with security groups; it's a deliberate decision, which we are not privy to, sadly. :)
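
A minimal sketch of that workaround: instead of 0.0.0.0/0, port 6443 is opened to the whole VPC range so that whichever private IPs the NLB happens to use are allowed. The security group ID and CIDR below are placeholders:

```go
package main

import (
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/ec2"
)

func main() {
	client := ec2.New(session.Must(session.NewSession()))

	// Allow the API server port from anywhere inside the VPC (covers the
	// NLB's private addresses) rather than from 0.0.0.0/0.
	_, err := client.AuthorizeSecurityGroupIngress(&ec2.AuthorizeSecurityGroupIngressInput{
		GroupId: aws.String("sg-0123456789abcdef0"), // placeholder control plane SG
		IpPermissions: []*ec2.IpPermission{{
			IpProtocol: aws.String("tcp"),
			FromPort:   aws.Int64(6443),
			ToPort:     aws.Int64(6443),
			IpRanges: []*ec2.IpRange{{
				CidrIp:      aws.String("10.0.0.0/16"), // placeholder VPC CIDR
				Description: aws.String("kube-apiserver via NLB"),
			}},
		}},
	})
	if err != nil {
		log.Fatal(err)
	}
}
```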

@johannesfrey

Ah, sure. 0.0.0.0 is not necessary 🙂.
Yeah, really a bummer that NLBs do not work with security groups in the usual way.
But speaking of a SG in front of the EC2 instance of the control-plane, it would not suffice to add the client IP as source (with client IP preservation on) because in reality the NLB connects to that instance (with an unknown internal IP)?

On a side note: @Skarlso, thanks so much for your quick replies and that you are driving this forward. Awesome! 👍

@Skarlso
Contributor

Skarlso commented Nov 4, 2022

No problem. :) I like working on this issue. :D It's very complex. :)

But speaking of a SG in front of the EC2 instance of the control-plane, it would not suffice to add the client IP as source (with client IP preservation on) because in reality the NLB connects to that instance (with an unknown internal IP)?

I'm not 100% sure yet, to be honest. I can't see which IP address the NLB uses to contact the instance. If it's using preservation, I'm not sure which client IP it preserves.

@Skarlso
Contributor

Skarlso commented Nov 4, 2022

Incidentally, that might be the cause of my current problem. Namely that the instance is not joining the cluster. Or more precisely, it looks like kubelet is failing to start up.

@Skarlso
Contributor

Skarlso commented Nov 4, 2022

BOOM

➜  k get nodes --kubeconfig test.kubeconfig
NAME                                           STATUS   ROLES                  AGE     VERSION
ip-10-0-129-98.eu-central-1.compute.internal   Ready    control-plane,master   5m55s   v1.22.1


➜  cluster clusterctl describe cluster test-aws-cluster
NAME                                                                 READY  SEVERITY  REASON  SINCE  MESSAGE
Cluster/test-aws-cluster                                             True                     6m2s
├─ClusterInfrastructure - AWSCluster/test-aws-cluster                True                     8m30s
└─ControlPlane - KubeadmControlPlane/test-aws-cluster-control-plane  True                     6m2s
  └─Machine/test-aws-cluster-control-plane-pjnbs                     True                     8m10s

@Skarlso
Contributor

Skarlso commented Nov 4, 2022

Okay, so this works in one of two ways. Either:

  • Disable client IP preservation and only allow the VPC's private address range through, or
  • Enable client IP preservation, but in that case I have to open it up to 0.0.0.0/0 for it to work.

What I'm going to do is create a parameter/switch that lets the user control this behaviour. Is that okay?
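
Purely as an illustration of such a switch (the type and field names here are hypothetical, not the API that was eventually merged; see the linked PR for that):

```go
package v1sketch

// ControlPlaneNLBOptions sketches the toggle described above.
type ControlPlaneNLBOptions struct {
	// PreserveClientIP keeps the original client source IP on connections to
	// the API server. When true, the security group rule effectively has to
	// allow 0.0.0.0/0; when false, traffic appears to come from the NLB's
	// private addresses, so allowing the VPC CIDR is enough.
	// +optional
	PreserveClientIP bool `json:"preserveClientIP,omitempty"`
}
```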

@johannesfrey

johannesfrey commented Nov 4, 2022

Okay, so this works in one of two ways. Either:

* Disable client IP preservation and only allow the VPC's private address range through, or

* Enable client IP preservation, but in that case I have to open it up to `0.0.0.0/0` for it to work.

What I'm going to do is create a parameter/switch that lets the user control this behaviour. Is that okay?

Ah ok. So with client IP preservation on, you must open it up to 0.0.0.0/0 because you cannot know beforehand which client will try to connect (e.g. from another VPC or externally)? Whereas with client IP preservation off, the source will always be the NLB's IP, which resides in the same VPC?!

@Skarlso
Contributor

Skarlso commented Nov 4, 2022

Ah ok. So with client IP preservation on, you must open it up to 0.0.0.0/0 because you cannot know beforehand which client will try to connect (e.g. from another VPC or externally)? Whereas with client IP preservation off, the source will always be the NLB's IP, which resides in the same VPC?!

Correct. 👍

@johannesfrey

Ah ok. So with client IP preservation on, you must open it up to 0.0.0.0/0 because you cannot know beforehand which client will try to connect (e.g. from another VPC or externally)? Whereas with client IP preservation off, the source will always be the NLB's IP, which resides in the same VPC?!

Correct. +1

Ok, got it. One naive thought: instead of prematurely opening it up completely to 0.0.0.0/0 here in CAPA, couldn't we delegate the work to, or do something similar to, e.g. the aws-load-balancer-controller, which picks up the given CIDR from service.spec.loadBalancerSourceRanges and creates the necessary SGs/ingress rules only for that CIDR?

Though, this would depend on a Service of type LoadBalancer for the api-server... And this would also probably be too late, as the opening up is required during the Cluster creation phase?! Though kubelet should not rely on it, as it will talk to the local api-server because of the /etc/hosts rewrite?! But most probably I'm mixing up stuff here 😅

@Skarlso
Contributor

Skarlso commented Nov 4, 2022

And this would also probably be too late, as the opening up is required during the Cluster creation phase

This is correct. Without opening it up, the node will not join the cluster and the kubelet will fail with various errors about things it can't reach.

Some of which MIGHT be the hairpinning issue I'm investigating. For example, just this:

Nov 04 08:23:45 ip-10-0-118-234 kubelet[1810]: E1104 08:23:45.087874    1810 kubelet.go:2332] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized"
Nov 04 08:23:45 ip-10-0-118-234 kubelet[1810]: E1104 08:23:45.156058    1810 kubelet.go:2407] "Error getting node" err="node \"ip-10-0-118-234.eu-central-1.compute.internal\" not found"
Nov 04 08:23:45 ip-10-0-118-234 kubelet[1810]: E1104 08:23:45.256934    1810 kubelet.go:2407] "Error getting node" err="node \"ip-10-0-118-234.eu-central-1.compute.internal\" not found"

Looks very suspiciously like a hairpinning issue where the kubelet can't reach the api-server through the load balancer's domain name.

@Skarlso
Contributor

Skarlso commented Nov 4, 2022

Okay, so I got something working for now. I'm going to roll with that, then iterate on the possible hairpinning issue.

@Skarlso
Contributor

Skarlso commented Nov 6, 2022

So I tried the hosts rewrite. Sadly, it didn't work. :(

Nov 06 06:22:43 ip-10-0-120-185 kubelet[2114]: E1106 06:22:43.455629    2114 kubelet.go:2407] "Error getting node" err="node \"ip-10-0-120-185.eu-central-1.compute.internal\" not found"
Nov 06 06:22:43 ip-10-0-120-185 kubelet[2114]: E1106 06:22:43.554024    2114 controller.go:144] failed to ensure lease exists, will retry in 7s, error: Get "https://ypcqehss1nfq26i6gfooheufe2ok-k8s-3f5da43546a9b6c8.elb.eu-central-1.amazonaws.com:6443/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/ip-10-0-120-185.eu-central-1.compute.internal?timeout=10s": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Nov 06 06:22:43 ip-10-0-120-185 kubelet[2114]: E1106 06:22:43.558110    2114 kubelet.go:2407] "Error getting node" err="node \"ip-10-0-120-185.eu-central-1.compute.internal\" not found"

So for now, I'm staying with my solution. FWIW, the Network Load Balancer troubleshooting doc also says to disable client IP preservation if hairpinning is an issue.
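
For reference, client IP preservation is a target-group attribute (`preserve_client_ip.enabled`). A minimal sketch of turning it off with the AWS SDK for Go, with a placeholder target group ARN:

```go
package main

import (
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/elbv2"
)

func main() {
	client := elbv2.New(session.Must(session.NewSession()))

	// Disable client IP preservation on the API server target group so that
	// connections hairpinned back to a control plane node no longer carry the
	// node's own source IP.
	_, err := client.ModifyTargetGroupAttributes(&elbv2.ModifyTargetGroupAttributesInput{
		TargetGroupArn: aws.String("arn:aws:elasticloadbalancing:eu-central-1:111122223333:targetgroup/example/0123456789abcdef"), // placeholder
		Attributes: []*elbv2.TargetGroupAttribute{{
			Key:   aws.String("preserve_client_ip.enabled"),
			Value: aws.String("false"),
		}},
	})
	if err != nil {
		log.Fatal(err)
	}
}
```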

@voor
Member

voor commented Nov 6, 2022

Does it set the X-Forwarded-For request header or the X-Real-Ip header? That way audit logs will still have the client IP somewhere.

This is extremely unlikely, since an NLB is layer 4 and headers are layer 7.

@Skarlso
Contributor

Skarlso commented Nov 6, 2022

It does, yes.

@Skarlso
Contributor

Skarlso commented Nov 6, 2022

Ah, yeah, I misread the doc. You're right @voor, it does not set that header. Something else does if client IP preservation is on.

@richardcase richardcase removed this from the v1.x milestone Nov 10, 2022
@johannesfrey

Just out of interest, as it seems not to be tracked in a milestone: I assume it won't be part of the v2.0.0 milestone?!

@Skarlso
Contributor

Skarlso commented Nov 16, 2022

Probably not, because the PR still needs reviews, tests, etc. That won't happen right now; the team is pushing for the 2.0.0 milestone and fixes. Sorry! :/

@johannesfrey

johannesfrey commented Nov 16, 2022

Probably not, because the PR still needs reviews, tests, etc. That won't happen right now; the team is pushing for the 2.0.0 milestone and fixes. Sorry! :/

Sure, no worries 🙂. Totally understandable. Just noticed that 2.0.0 is about to be released soon, so I thought I'd ask. Really nice that 2.0.0 is coming, even if without NLB 🙂. Unfortunately, I'm not too deep into the code base, but I will also take a look at the PR (this should not block any further changes etc. :)) next week at the latest.

@Skarlso
Contributor

Skarlso commented Dec 2, 2022

There are two hanging things:

But I'd consider this closed. Please test and see if anything is missing or not working. :)
