
"Instance failed to join" status despite the instance actually having joined #719

Closed
a13x5 opened this issue Aug 2, 2021 · 5 comments


@a13x5

a13x5 commented Aug 2, 2021

What happened:
When creating an EKS node group from a launch template with custom user data, it fails with an "Instance failed to join the kubernetes cluster" error. But the kubectl get node command shows the nodes created by the node group.

What you expected to happen:
Node successfully joins the cluster.

How to reproduce it (as minimally and precisely as possible):
Create a node group using a launch template with a custom user data script:

echo 'Custom user-data script'
/etc/eks/bootstrap.sh tst \
--kubelet-extra-args '--max-pods=100' \
--b64-cluster-ca <REDACTED> \
--apiserver-endpoint https://<REDACTED>.us-west-2.eks.amazonaws.com \
--use-max-pods false

Anything else we need to know?:
Without using a launch template it works fine.
All resources were created using the Terraform code available in a gist.
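For reference, the relevant pieces look roughly like the sketch below (a minimal illustration, not the actual gist contents: the IAM role, subnets, and release_version value are assumed):

resource "aws_launch_template" "tst" {
  name = "tst"

  # Custom bootstrap script shown above, MIME multi-part encoded. Note that no image_id is set.
  user_data = base64encode(file("user-data.mime"))
}

resource "aws_eks_node_group" "tst" {
  cluster_name    = "tst"
  node_group_name = "tst"
  node_role_arn   = aws_iam_role.node.arn # assumed to be defined elsewhere in the gist
  subnet_ids      = var.subnet_ids        # assumed to be defined elsewhere in the gist
  release_version = "1.19.6-20210512"     # illustrative value; later removed as part of the fix

  launch_template {
    id      = aws_launch_template.tst.id
    version = aws_launch_template.tst.latest_version
  }

  scaling_config {
    desired_size = 1
    max_size     = 2
    min_size     = 1
  }
}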
Environment:

  • AWS Region: us-west-2
  • Instance Type(s): m5a.large
  • EKS Platform version: eks.5
  • Kubernetes version: 1.19
  • AMI Version: amazon-eks-node-1.19-v20210512
  • Kernel (e.g. uname -a): Linux ip-10-0-5-52.us-west-2.compute.internal 5.4.110-54.189.amzn2.x86_64 #1 SMP Mon Apr 26 21:25:38 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
  • Release information (run cat /etc/eks/release on a node):
BASE_AMI_ID="ami-004a571bc4ab7023a"
BUILD_TIME="Wed May 12 16:45:14 UTC 2021"
BUILD_KERNEL="5.4.110-54.189.amzn2.x86_64"
ARCH="x86_64"

@ravisinha0506
Contributor

Hi @Alex-Sizov,

  • Is this intermittent behavior, or does it happen every time?
  • Could you share the node group ARN, so that we can debug this from our end?
  • Could you share the labels you see on the node?

@a13x5
Author

a13x5 commented Aug 3, 2021

Hi @ravisinha0506 !

  • Yes, this happens every time. I tried 4 times and got the same result each time.
  • Node group ARN is arn:aws:eks:us-west-2:560065381221:nodegroup/tst/tst/72bd78af-c09c-18bb-3796-c35618fa4130
  • The labels on the node are the following (from kubectl get node):
  labels:
    beta.kubernetes.io/arch: amd64
    beta.kubernetes.io/instance-type: m5a.large
    beta.kubernetes.io/os: linux
    failure-domain.beta.kubernetes.io/region: us-west-2
    failure-domain.beta.kubernetes.io/zone: us-west-2a
    kubernetes.io/arch: amd64
    kubernetes.io/hostname: ip-10-0-5-52.us-west-2.compute.internal
    kubernetes.io/os: linux
    node.kubernetes.io/instance-type: m5a.large
    topology.kubernetes.io/region: us-west-2
    topology.kubernetes.io/zone: us-west-2a

@deepanverma19

deepanverma19 commented Aug 4, 2021

@ravisinha0506 I am also facing the same issue as @Alex-Sizov +1

@suket22
Member

suket22 commented Aug 10, 2021

@Alex-Sizov

When creating a Managed node group with a launch template, the behavior differs based on whether an AMI has been specified in the launch template or not.

When no AMI is present in the launch template (as is the case for you, if I'm reading your gist correctly), EKS merges an additional MIME multi-part section into the user data contents you've passed in. The part EKS merges in attempts to bootstrap your worker node as well. Since MIME multi-part sections are executed in order, your bootstrapping happens first and the EKS bootstrapping becomes a no-op.

As a result, your worker nodes don't have the required labels for EKS to associate them with a node group.
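To make the ordering concrete, the merged user data ends up looking roughly like the following (the boundary string and the EKS-generated part are paraphrased for illustration, not the exact content):

MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="==BOUNDARY=="

--==BOUNDARY==
Content-Type: text/x-shellscript; charset="us-ascii"

#!/bin/bash
# Part supplied in your launch template: runs first and bootstraps the node.
echo 'Custom user-data script'
/etc/eks/bootstrap.sh tst --kubelet-extra-args '--max-pods=100' ...

--==BOUNDARY==
Content-Type: text/x-shellscript; charset="us-ascii"

#!/bin/bash
# Part merged in by EKS (paraphrased): runs second, finds an already-bootstrapped
# node, and is effectively a no-op.
/etc/eks/bootstrap.sh tst ...

--==BOUNDARY==--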

You can fix this by specifying the worker AMI you'd like to use within your launch template and passing that to EKS. See this documentation for more details.
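In Terraform terms, the change would look roughly like the sketch below (the SSM parameter lookup is just one common way to resolve the EKS optimized AMI ID; it is an illustration, not something taken from your gist):

data "aws_ssm_parameter" "eks_ami" {
  name = "/aws/service/eks/optimized-ami/1.19/amazon-linux-2/recommended/image_id"
}

resource "aws_launch_template" "tst" {
  name = "tst"

  # With image_id set, EKS no longer merges its own bootstrap part, so the
  # custom user data is solely responsible for running /etc/eks/bootstrap.sh.
  image_id  = data.aws_ssm_parameter.eks_ami.value
  user_data = base64encode(file("user-data.mime")) # unchanged custom bootstrap script
}

The release_version parameter would then be dropped from the aws_eks_node_group, since the AMI is pinned in the launch template instead.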

@a13x5
Author

a13x5 commented Aug 23, 2021

I've removed the release_version parameter from the EKS node group and added image_id to my launch template, and it all works just fine now. I tested both cluster creation and updating.
Thank you very much @suket22 !
