Pods stuck on Terminating after EKS 1.26 update #1445

Closed
korjek opened this issue Sep 28, 2023 · 1 comment


korjek commented Sep 28, 2023

What happened:
After the cluster was updated to EKS 1.26, and also on EKS 1.27 (nodes use the EKS-optimized AMI ami-0c92ea9c7c0380b66 (amazon-eks-node-1.27-v20230919)), some pods became stuck in Terminating indefinitely.

At the same time, Karpenter recognizes the node as empty and deletes it, so the pod can still be listed with kubectl but its node is already gone.
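
For anyone trying to confirm the same symptom, here is a minimal sketch (not part of the original report) that lists pods marked for deletion whose node no longer exists. It assumes the input is the JSON printed by `kubectl get pods --all-namespaces -o json` and `kubectl get nodes -o json`; the function names `find_orphaned_terminating_pods`, `kubectl_json`, and `main` are made up for illustration.

```python
import json
import subprocess


def find_orphaned_terminating_pods(pods, nodes):
    """Return (namespace, name, node) for deleting pods whose node is gone.

    `pods` and `nodes` are parsed Kubernetes list objects (dicts with "items").
    """
    existing_nodes = {n["metadata"]["name"] for n in nodes.get("items", [])}
    orphaned = []
    for pod in pods.get("items", []):
        meta = pod.get("metadata", {})
        node_name = pod.get("spec", {}).get("nodeName")
        # kubectl shows a pod as Terminating once metadata.deletionTimestamp is set.
        if meta.get("deletionTimestamp") and node_name and node_name not in existing_nodes:
            orphaned.append((meta.get("namespace"), meta.get("name"), node_name))
    return orphaned


def kubectl_json(*args):
    """Run kubectl with JSON output and parse it (assumes kubectl is on PATH)."""
    result = subprocess.run(
        ["kubectl", *args, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


def main():
    pods = kubectl_json("get", "pods", "--all-namespaces")
    nodes = kubectl_json("get", "nodes")
    for namespace, name, node_name in find_orphaned_terminating_pods(pods, nodes):
        print(f"{namespace}/{name} is terminating on missing node {node_name}")
```

Calling `main()` on a machine with cluster access prints one line per affected pod. The sketch only reports; it does not delete anything.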

What you expected to happen:
No pods remain stuck in Terminating.

How to reproduce it (as minimally and precisely as possible):
I'm not aware of a reliable way to reproduce it.

Environment:

  • AWS Region: us-east-1
  • Instance Type(s): doesn't matter, happens on different instance types.
  • EKS Platform version (use aws eks describe-cluster --name <name> --query cluster.platformVersion): eks.5
  • Kubernetes version (use aws eks describe-cluster --name <name> --query cluster.version): 1.27
  • AMI Version: ami-0c92ea9c7c0380b66 (amazon-eks-node-1.27-v20230919)
  • Kernel (e.g. uname -a): Linux ip-10-34-44-24.node.domain 5.10.192-183.736.amzn2.x86_64 #1 SMP Wed Sep 6 21:15:41 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
  • Release information (run cat /etc/eks/release on a node):
BASE_AMI_ID="ami-0963f2c76238b64d5"
BUILD_TIME="Tue Sep 19 17:51:01 UTC 2023"
BUILD_KERNEL="5.10.192-183.736.amzn2.x86_64"
ARCH="x86_64"

@cartermckinnon (Member)

If the Pod is left in an indeterminate state and the instance no longer exists, that isn't a problem with the kubelet/AMI. It may be related to this issue with the pod garbage collector: kubernetes/kubernetes#118261

cartermckinnon closed this as not planned on Sep 28, 2023