Support for non-EKS IRSA #3560

thefirstofthe300 · 2022-06-27T21:31:51Z

/kind feature

IRSA support appears to have been completed for EKS clusters in #2054. My company currently uses non-EKS clusters in combination with IRSA, and I'd like the ability to install IRSA using the provider. The main piece that is tricky to coordinate is the certificate authority used to sign and validate the JWTs. If these pieces could be automated as part of the Ignition config (we also use Flatcar), our installation and maintenance burden would be greatly reduced.

Environment:

Cluster-api-provider-aws version: v1.4
Kubernetes version: (use kubectl version): v1.24
OS (e.g. from /etc/os-release): Flatcar Container Linux by Kinvolk 3139.2.3 (Oklo)


thefirstofthe300 · 2022-08-30T23:50:18Z

I managed to build a POC using the built-in well-known OIDC discovery endpoint in the API server (kubernetes/kubernetes#98553); however, it currently requires bringing my own infra and knowing the CA cert's fingerprint. Here's my configuration:

BYO load balancer listening on port 443 and forwarding traffic to the API server
BYO load balancer security group with the proper configuration to allow traffic to the ELB on port 443
Add a cluster role binding allowing unauthenticated users to fetch the /.well-known/openid-configuration endpoint: kubectl create clusterrolebinding oidc-reviewer --clusterrole=system:service-account-issuer-discovery --group=system:unauthenticated
Create the OIDC provider with the CA cert's fingerprint
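
For the last step, the fingerprint that an IAM OIDC provider expects is the hex-encoded SHA-1 hash of the DER-encoded certificate, without colons. A minimal Python sketch of computing it from a PEM file follows. The function name and the `ca.crt` path are illustrative assumptions, not anything CAPA provides, and the sketch assumes the file holds a single certificate.

```python
import hashlib
import ssl


def ca_thumbprint(pem_cert: str) -> str:
    """Return the SHA-1 fingerprint of a single PEM certificate as lowercase hex."""
    # Strip the PEM armor and base64-decode to get the DER bytes.
    der = ssl.PEM_cert_to_DER_cert(pem_cert.strip())
    # The digest is 40 hex characters with no colon separators.
    return hashlib.sha1(der).hexdigest()


if __name__ == "__main__":
    import sys

    # Usage: python thumbprint.py ca.crt  ("ca.crt" is a placeholder path).
    if len(sys.argv) == 2:
        with open(sys.argv[1], encoding="ascii") as f:
            print(ca_thumbprint(f.read()))
```

The resulting value is what would be supplied as the thumbprint when creating the OIDC provider in IAM.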

Unfortunately, I don't have time to contribute this feature to CAPA right now, or I'd add it myself.

k8s-triage-robot · 2022-11-28T23:56:13Z

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue or PR as fresh with /remove-lifecycle stale
Mark this issue or PR as rotten with /lifecycle rotten
Close this issue or PR with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

thefirstofthe300 · 2022-11-29T22:16:11Z

/remove-lifecycle stale

luthermonson · 2022-12-28T22:49:59Z

I just had to spec this out and will code it into CAPA shortly. I'll describe my approach here, and I would like some more eyes on it before I spend too much time coding. First, I'm basing most of my knowledge of how to get this working on an example written by @smalltown, located here: https://github.com/smalltown/aws-irsa-example, and the accompanying blog article: https://medium.com/codex/irsa-implementation-in-kops-managed-kubernetes-cluster-18cef84960b6

Approach

Users will have to create an IaaS cluster with bucket management turned on. This is a feature CAPA already uses to manage an S3 bucket for Ignition userdata, and we will reuse it to create/delete a bucket per cluster.
Make associateOidcProvider: true work for IaaS clusters. It will use cert-manager to generate certs, add keys.json and .well-known/openid-configuration to the bucket and make them public-read, then create the OIDC provider in IAM, generate a trust policy, and store it in a config map that you can use later in your roles. This is identical to how managed clusters work.
Install https://github.com/aws/amazon-eks-pod-identity-webhook by taking its manifests and moving them into ./config, modifying them to use a cert from cert-manager, and having CAPA deploy the webhook to the cluster.
As with EKS' implementation, CAPA will not build the IRSA roles for you. The user is expected to create the role with the policies their service needs and add the contents of the configmap as its trust policy to make a working role. The user is also expected to take the ARN of that role and set it as the eks.amazonaws.com/role-arn annotation on a service account, then assign that service account to a pod.
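
To make the last step concrete, here is a minimal Python sketch of the kind of trust policy the configmap could hold and of the service account annotation the user would add. The account ID, issuer host/path, namespace, service account, and role names are placeholders, and the helper functions are hypothetical; this is not CAPA's actual output format.

```python
import json


def irsa_trust_policy(account_id, issuer_hostpath, namespace, service_account):
    """Build an IAM trust policy that lets one service account assume a role."""
    provider_arn = f"arn:aws:iam::{account_id}:oidc-provider/{issuer_hostpath}"
    subject = f"system:serviceaccount:{namespace}:{service_account}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Federated": provider_arn},
                "Action": "sts:AssumeRoleWithWebIdentity",
                # Restrict the role to tokens issued for this service account.
                "Condition": {"StringEquals": {f"{issuer_hostpath}:sub": subject}},
            }
        ],
    }


def service_account_manifest(name, namespace, role_arn):
    """Build a ServiceAccount carrying the role ARN annotation read by the webhook."""
    return {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {
            "name": name,
            "namespace": namespace,
            "annotations": {"eks.amazonaws.com/role-arn": role_arn},
        },
    }


if __name__ == "__main__":
    # All values below are placeholders for illustration only.
    policy = irsa_trust_policy(
        "111122223333", "example-bucket.s3.amazonaws.com", "default", "my-app"
    )
    print(json.dumps(policy, indent=2))
    sa = service_account_manifest(
        "my-app", "default", "arn:aws:iam::111122223333:role/my-app-role"
    )
    print(json.dumps(sa, indent=2))
```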

@thefirstofthe300 please read the above and see if you agree with the approach. I like your idea of using ServiceAccountIssuerDiscovery, but I feel that having CAPA spin up extra infra is a bit too much to ask of users just to get IRSA support when the S3 buckets are already being used/managed by CAPA and have free public URLs. The only pieces CAPA would need to manage are some certs, the IAM OIDC provider, the webhook, and the contents of the files in S3.
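
As a rough illustration of "the contents of the files in S3", the sketch below builds a discovery document of the kind served at .well-known/openid-configuration, pointing at keys.json in the same location. The field set is modeled on what self-hosted IRSA setups commonly publish and should be treated as an assumption; the issuer URL is a placeholder and the function name is hypothetical. The keys.json file itself (the JWKS for the service account signing key) is not generated here.

```python
import json


def discovery_document(issuer_url):
    """Build an OIDC discovery document for a statically hosted issuer."""
    issuer = issuer_url.rstrip("/")
    return {
        "issuer": issuer,
        # keys.json sits next to the discovery document in the bucket.
        "jwks_uri": f"{issuer}/keys.json",
        "authorization_endpoint": "urn:kubernetes:programmatic_authorization",
        "response_types_supported": ["id_token"],
        "subject_types_supported": ["public"],
        "id_token_signing_alg_values_supported": ["RS256"],
        "claims_supported": ["sub", "iss"],
    }


if __name__ == "__main__":
    # Placeholder issuer URL; a real one would be the bucket's public URL.
    doc = discovery_document("https://example-bucket.s3.amazonaws.com/")
    print(json.dumps(doc, indent=2))
```

The issuer value has to match the service account issuer configured on the API server exactly, which is why the sketch normalizes the trailing slash in one place.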

thefirstofthe300 · 2023-01-01T03:02:11Z

Either way will work. I think the main question comes down to how certs should work.

In essence, the way I went about creating things is identical to the way CAPA creates its infrastructure; however, instead of using port 6443 as the API server ELB serving port, it uses port 443. I'd think providing the ability to set the serving port to 443 instead of 6443 would be relatively simple in the CAPA code and would remove the need for the BYO infra.

The two things I've had to shove into my configs are adding the root CA cert to the API serving cert chain using Ignition and adding the cluster role binding that allows unauthenticated users to view the OpenID configuration.
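
The second of those two pieces could be expressed as a manifest rather than a kubectl command. A minimal Python sketch that renders such a ClusterRoleBinding as JSON (which kubectl also accepts) is shown below. The binding name oidc-reviewer mirrors the name used earlier in this thread, and the function is a hypothetical helper, not something CAPA ships.

```python
import json


def issuer_discovery_binding(name="oidc-reviewer"):
    """Build a ClusterRoleBinding that opens OIDC discovery to anonymous callers."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRoleBinding",
        "metadata": {"name": name},
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "ClusterRole",
            # Built-in role covering the discovery and JWKS endpoints.
            "name": "system:service-account-issuer-discovery",
        },
        "subjects": [
            {
                "apiGroup": "rbac.authorization.k8s.io",
                "kind": "Group",
                "name": "system:unauthenticated",
            }
        ],
    }


if __name__ == "__main__":
    # Pipe the output to: kubectl apply -f -
    print(json.dumps(issuer_discovery_binding(), indent=2))
```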

Doing this removes the need to rely on S3 to serve the configuration.

The main piece that I have a question on is whether people want to serve the root cert with their serving chain. Ultimately, I don't think it opens any security holes as the certificates themselves are basically a public key and no change is made to the TLS private key.

I'd be interested in hearing some input from someone more in tune with the security side.

dlipovetsky · 2023-01-23T17:55:37Z

How much of this can we solve with documentation vs. code and cluster/infrastructure component template changes?

/triage needs-information

luthermonson · 2023-01-31T21:42:09Z

@dlipovetsky fair question, but all the pieces together make this a complex feature: it mixes AWS resources (S3, OIDC provider) and Kubernetes resources (cert, deployment), the one thing that knows about all of them is CAPA, and they can only be built at cluster creation time. Once I get this written, we will have feature parity between IaaS and EKS for IRSA, and I feel it's only fair that CAPA does the work to reach parity between the two cluster types.

Skarlso · 2023-02-01T20:18:58Z

@dlipovetsky I think I agree with Luther. I don't believe a document change can handle this. The IRSA logic requires a fair amount of work between the pods, assigning things, creating connections, adding the webhook, generating certificates etc.

Honestly, this is quite a large piece of work. @luthermonson and @thefirstofthe300, do you think we could break this down into several more minor issues and use this issue as an epic to track each piece?

Such as creating the configuration, supporting 443, adding the webhook installation process, etc.

k8s-triage-robot · 2023-05-02T21:02:50Z

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue as fresh with /remove-lifecycle stale
Close this issue with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

Skarlso · 2023-05-06T19:22:14Z

/remove-lifecycle stale

Skarlso · 2023-05-06T19:22:38Z

/triage accepted

k8s-triage-robot · 2024-05-05T19:32:23Z

This issue has not been updated in over 1 year, and should be re-triaged.

You can:

Confirm that this issue is still relevant with /triage accepted (org members only)
Close this issue with /close

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted

k8s-ci-robot · 2024-05-05T19:32:27Z

This issue is currently awaiting triage.

If CAPA/CAPI contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-triage-robot · 2024-08-03T19:58:54Z

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue as fresh with /remove-lifecycle stale
Close this issue with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot · 2024-09-02T20:36:30Z

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue as fresh with /remove-lifecycle rotten
Close this issue with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. needs-priority needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jun 27, 2022

k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 28, 2022

k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 29, 2022

k8s-ci-robot added the triage/needs-information Indicates an issue needs more information in order to work on it. label Jan 23, 2023

luthermonson linked a pull request Feb 22, 2023 that will close this issue

Add IRSA Support to Self Managed Clusters #4094

Open

4 tasks

luthermonson mentioned this issue Mar 23, 2023

add IRSA for self-managed clusters proposal #4164

Merged

k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 2, 2023

k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 6, 2023

k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels May 6, 2023

k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. and removed triage/accepted Indicates an issue or PR is ready to be actively worked on. labels May 5, 2024

k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 3, 2024

sl1pm4t mentioned this issue Aug 21, 2024

✨ feat: Add IRSA Support for Self Managed Clusters (2024 edition) #5101

Closed

5 tasks

sl1pm4t linked a pull request Aug 28, 2024 that will close this issue

✨ feat: Add IRSA support for self-managed clusters (rebase) #5109

Open

5 tasks

k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Sep 2, 2024
