
Helm-controller pod is using stale tokens #479

Closed
albertschwarzkopf opened this issue May 10, 2022 · 17 comments · Fixed by #480
Labels
bug Something isn't working

Comments

@albertschwarzkopf

Hi,

the "Bound Service Account Token Volume" is graduated to stable and enabled by default in Kubernetes version 1.22.
I am using "helm-controller:v0.21.0" in AWS EKS 1.22 and I have checked, if it is using stale tokens (regarding https://docs.aws.amazon.com/eks/latest/userguide/kubernetes-versions.html and https://docs.aws.amazon.com/eks/latest/userguide/troubleshooting.html#troubleshooting-boundservicetoken).

So when the API server receives requests with tokens that are older than one hour, then it annotates the pod with "annotations.authentication.k8s.io/stale-token". In my case I can see the following annotation. E.g.:
"annotations":{"authentication.k8s.io/stale-token":"subject: system:serviceaccount:flux-system:helm-controller, seconds after warning threshold: 56187"

Version:

helm-controller:v0.21.0

Cluster Details

AWS EKS 1.22

Steps to reproduce issue

  • Enable EKS Audit Logs
  • Query CW Insights (select the cluster log group) with the query below; a scripted sketch follows after it:
fields @timestamp
| filter @message like /seconds after warning threshold/
| parse @message "subject: *, seconds after warning threshold:*\"" as subject, elapsedtime
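
A hypothetical scripted version of the same query, sketched with aws-sdk-go-v2; the log group name /aws/eks/my-cluster/cluster and the 24-hour window are placeholders:

package main

import (
	"context"
	"fmt"
	"time"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/cloudwatchlogs"
)

func main() {
	ctx := context.Background()
	cfg, err := config.LoadDefaultConfig(ctx)
	if err != nil {
		panic(err)
	}
	client := cloudwatchlogs.NewFromConfig(cfg)

	query := `fields @timestamp
| filter @message like /seconds after warning threshold/
| parse @message "subject: *, seconds after warning threshold:*\"" as subject, elapsedtime`

	now := time.Now()
	start, err := client.StartQuery(ctx, &cloudwatchlogs.StartQueryInput{
		// Placeholder: replace with the audit log group of your cluster.
		LogGroupName: aws.String("/aws/eks/my-cluster/cluster"),
		QueryString:  aws.String(query),
		StartTime:    aws.Int64(now.Add(-24 * time.Hour).Unix()),
		EndTime:      aws.Int64(now.Unix()),
	})
	if err != nil {
		panic(err)
	}

	// Poll until the query finishes, then print the matched fields.
	for {
		out, err := client.GetQueryResults(ctx, &cloudwatchlogs.GetQueryResultsInput{
			QueryId: start.QueryId,
		})
		if err != nil {
			panic(err)
		}
		if string(out.Status) != "Running" && string(out.Status) != "Scheduled" {
			for _, row := range out.Results {
				for _, f := range row {
					fmt.Printf("%s=%s ", aws.ToString(f.Field), aws.ToString(f.Value))
				}
				fmt.Println()
			}
			return
		}
		time.Sleep(2 * time.Second)
	}
}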
@stefanprodan
Member

@albertschwarzkopf can you please confirm this happens with kustomize-controller also?

@albertschwarzkopf
Author

@stefanprodan thanks for the fast reply!

No, helm-controller only.
kustomize-controller is running version 0.25.0.

Also no issue with notification-controller:v0.23.5 and source-controller:v0.24.4

@stefanprodan
Member

Does kustomize-controller run on the same node as helm-controller? Can you please post the output of kubectl -n flux-system get pods -owide here?

@albertschwarzkopf
Author

No, they are running on different nodes at the moment (we have several nodes).

[screenshot: kubectl -n flux-system get pods -owide output]

@stefanprodan
Member

I see that kustomize-controller was restarted recently. Please wait one hour and report back whether kustomize-controller runs into the same issue. I'm trying to figure out if this is specific to helm-controller or a general problem with Kubernetes client-go on EKS.

@pjbgf
Member

pjbgf commented May 10, 2022

Relates to fluxcd/flux2#2074

@albertschwarzkopf
Author

> I see that kustomize-controller was restarted recently. Please wait one hour and report back whether kustomize-controller runs into the same issue. I'm trying to figure out if this is specific to helm-controller or a general problem with Kubernetes client-go on EKS.

After 72 minutes no issue with kustomize-controller...

@stefanprodan stefanprodan added the bug Something isn't working label May 10, 2022
@stefanprodan
Member

I've created an EKS cluster:

$ kubectl version
Server Version: v1.22.6-eks-14c7a48

I've waited one hour:

$ kubectl -n flux-system get po
NAME                                       READY   STATUS    RESTARTS   AGE
helm-controller-88f6889c6-pwf7f            1/1     Running   0          73m
kustomize-controller-784bd54978-bckm6      1/1     Running   0          73m
notification-controller-648bbb9db7-58c2d   1/1     Running   0          73m
source-controller-79f7866bc7-k25z5         1/1     Running   0          73m

And there is no stale-token annotation on the pod:

$ kubectl -n flux-system get po helm-controller-88f6889c6-pwf7f -oyaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    container.seccomp.security.alpha.kubernetes.io/manager: runtime/default
    kubernetes.io/psp: eks.privileged
    prometheus.io/port: "8080"
    prometheus.io/scrape: "true"
  creationTimestamp: "2022-05-10T10:08:59Z"
  generateName: helm-controller-88f6889c6-
  labels:
    app: helm-controller
    pod-template-hash: 88f6889c6
  name: helm-controller-88f6889c6-pwf7f
  namespace: flux-system

@albertschwarzkopf
Author

Yes, I can confirm this. Maybe it is visible only in the audit logs:

[screenshot: CloudWatch Logs Insights results showing the stale-token audit annotation]

@hiddeco
Member

hiddeco commented May 10, 2022

@albertschwarzkopf can you give the first image mentioned in #480 a try, and if that does not yield results, the second?

@albertschwarzkopf
Author

@hiddeco thanks! I have tried both images today. Only the image ghcr.io/hiddeco/helm-controller:head-412201a has worked as expected: with it, I can no longer see the mentioned annotation in the audit logs, even after 1 hour.

@hiddeco
Member

hiddeco commented May 11, 2022

Thanks for confirming. I'll finalize the PR in that case, and make sure it is included in the next release.

@Alan01252

Alan01252 commented May 12, 2022

Note: we even got an automated email about this from AWS!

As of April 20th 2022, we have identified the below service accounts attached to pods in one or more of your EKS clusters using stale (older than 1 hour) tokens. Service accounts are listed in the format <cluster ARN>|<namespace>:<serviceaccount>

arn:aws:eks:eu-west-2:***:cluster/prod-***|kube-system:multus
arn:aws:eks:eu-west-2:***:cluster/prod-***|flux-system:helm-controller

This also totally explains fluxcd/flux2#2074 (and the correlation between multus and helm-controller we saw).

@balonik

balonik commented May 12, 2022

Got the same message from AWS. Only the helm-controller SA was flagged. All controllers have been running for the same period of time.

NAME                                           READY   STATUS    RESTARTS      AGE
helm-controller-5676d55dff-7lgvn               1/1     Running   0             16d
image-automation-controller-6444ccb58c-8xcls   1/1     Running   0             16d
image-reflector-controller-f64677dd5-974qs     1/1     Running   0             16d
kustomize-controller-76f9d4f99f-htp8d          1/1     Running   0             16d
notification-controller-846fff6d67-h677q       1/1     Running   0             16d
source-controller-55d799ff7d-w598g             1/1     Running   0             16d

@luong-komorebi

luong-komorebi commented May 12, 2022

We got the notification message from AWS as well, but just for the helm-controller, even though all pods have been up and running for 85 days.

@valeriano-manassero

valeriano-manassero commented May 12, 2022

I can confirm the same problem here on EKS v1.22.6-eks-7d68063. Not sure if it's interesting or related, but after moving to EKS 1.22 the client authentication API changed from client.authentication.k8s.io/v1alpha1 to client.authentication.k8s.io/v1beta1.

@hiddeco
Member

hiddeco commented May 12, 2022

As already mentioned in #479 (comment), we have identified the issue and staged a patch; this will be solved in the next release.
