Latest AMIs (v20190220) missing commits in /etc/eks/bootstrap.sh #233

Closed
cplee opened this issue Mar 29, 2019 · 13 comments

Comments

@cplee

cplee commented Mar 29, 2019

What happened:
Launched instances with:

ami-05fe3f841ac4df3bb | amazon-eks-node-1.11-v20190327
ami-0d9f458329e942f90 | amazon-eks-node-1.12-v20190327

Found that the instances no longer support the --enable-docker-bridge arg. When I SSH into the instance, the bootstrap.sh file appears to be missing it as a supported arg.

What you expected to happen:
I expect /etc/eks/bootstrap.sh to match the one in the v20190220 AMIs, which supports --enable-docker-bridge.

How to reproduce it (as minimally and precisely as possible):
Launch instances with new v20190327 AMIs and run cat /etc/eks/bootstrap.sh
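
The check in this step can be scripted so it is easy to repeat across AMIs. This is a minimal sketch: `supports_flag` is a hypothetical helper name, not something shipped on the AMI, and a plain text match only shows that the flag is mentioned in the script, not that it is handled correctly.

```shell
# Hypothetical helper (not part of the AMI): succeed if the script text
# mentions the given flag, fail otherwise.
# Usage: supports_flag /etc/eks/bootstrap.sh --enable-docker-bridge
supports_flag() {
    script_path=$1
    flag=$2
    # -F: match the flag as a literal string, not a regex.
    # -q: no output, only the exit status.
    # -e: mark the next word as the pattern even though it starts with "--".
    grep -F -q -e "$flag" "$script_path"
}
```

On a node this could be run as `supports_flag /etc/eks/bootstrap.sh --enable-docker-bridge && echo supported || echo missing`.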

Anything else we need to know?:
I launched instances manually with v20190220 to verify I wasn't crazy and confirmed that the arg was there on that AMI. I also tried us-west-2 and reproduced the same issue.

Environment:

  • AWS Region: us-east-1
  • Instance Type(s): m5.large
  • EKS Platform version (use aws eks describe-cluster --name <name> --query cluster.platformVersion): eks.2
  • Kubernetes version (use aws eks describe-cluster --name <name> --query cluster.version): 1.11 and 1.12
  • AMI Version:
  • Kernel (e.g. uname -a): Linux ip-10-0-3-76.ec2.internal 4.14.104-95.84.amzn2.x86_64 #1 SMP Sat Mar 2 00:40:20 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
  • Release information (run cat /tmp/release on a node):
cat: /tmp/release: No such file or directory
@cplee
Author

cplee commented Mar 29, 2019

@micahhausler would love some guidance on this one. I'm struggling to find an AWS AMI for worker nodes that I can use. Seems between #183 affecting v20190211 and #193 affecting v20190220 there are a lot of landmines out there. How should I proceed?

@cplee
Author

cplee commented Mar 29, 2019

not sure if this helps, but the GPU AMIs do have the correct bootstrap.sh:

| 2019-03-28T02:40:48.000Z | ami-00f74c3728d4ca27d | amazon-eks-gpu-node-1.10-v20190327 |
| 2019-03-28T02:44:47.000Z | ami-06ec2ea207616c078 | amazon-eks-gpu-node-1.11-v20190327 |
| 2019-03-28T02:45:03.000Z | ami-0cb7959f92429410a | amazon-eks-gpu-node-1.12-v20190327 |

@matheuss

matheuss commented Mar 29, 2019

I'm on AWS Support Chat right now about this issue. It took me 4 hours to figure out that bootstrap.sh is old, so it treats --enable-docker-bridge as the cluster name. kubelet then gets Unauthorized from the API server, because the aws-iam-authenticator call in /var/lib/kubelet/kubeconfig is given --enable-docker-bridge as the cluster name 😕

I'm using ami-02a28cb577cff1b98 in eu-west-2.
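
The failure mode described here can be illustrated with a small stand-in. This is a sketch under stated assumptions, not the real /etc/eks/bootstrap.sh: `old_style_cluster_name` is a hypothetical function that models a script whose argument loop keeps every unrecognised argument as a positional and takes the first positional as the cluster name. The argument order in the usage note is also an assumption.

```shell
# Hypothetical model of an argument loop that predates --enable-docker-bridge.
# NOT the real bootstrap.sh; it only shows how an unknown flag can end up
# being used as the cluster name.
old_style_cluster_name() {
    cluster_name=""
    while [ "$#" -gt 0 ]; do
        case "$1" in
            --kubelet-extra-args)
                # Known flag: drop the flag, then its value if one is present.
                shift
                if [ "$#" -gt 0 ]; then
                    shift
                fi
                ;;
            *)
                # Unrecognised arguments are treated as positionals;
                # only the first one is kept as the cluster name.
                if [ -z "$cluster_name" ]; then
                    cluster_name=$1
                fi
                shift
                ;;
        esac
    done
    printf '%s\n' "$cluster_name"
}
```

With this model, `old_style_cluster_name --enable-docker-bridge true my-cluster` prints `--enable-docker-bridge` rather than `my-cluster`, which is consistent with the Unauthorized error described above.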

@mmcaya

mmcaya commented Mar 29, 2019

For reference, cat /etc/eks/release from amazon-eks-node-1.11-v20190327 (ami-05fe3f841ac4df3bb) us-east-1

BASE_AMI_ID="ami-027c5e2ccf2970def"
BUILD_TIME="Wed Mar 27 23:10:55 UTC 2019"
BUILD_KERNEL="4.14.104-95.84.amzn2.x86_64"
AMI_NAME="amazon-eks-node-1553728179"
ARCH="x86_64"
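
To compare builds across nodes, individual fields can be pulled out of this file. A minimal sketch, assuming the file keeps the KEY="value" layout shown above; `release_field` is a hypothetical helper name.

```shell
# Hypothetical helper: print the value of one KEY="value" entry
# from a release file laid out like the one quoted above.
# Usage: release_field /etc/eks/release BUILD_TIME
release_field() {
    release_path=$1
    key=$2
    # Match a line of the form KEY="value" and print only the value.
    sed -n "s/^${key}=\"\(.*\)\"\$/\1/p" "$release_path"
}
```

For example, `release_field /etc/eks/release BUILD_TIME` on the node above would print the build timestamp from that file.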

@max-rocket-internet
Contributor

I know EKS is relatively new and everything, but the number of issues we have in the AMIs is too much.

@mcrute
Contributor

mcrute commented Mar 29, 2019

I'm looking into this... will follow up when we have a fix ready.

@mcrute
Contributor

mcrute commented Mar 31, 2019

Updated AMIs have been released for 1.10, 1.11, and 1.12. These AMIs have the latest changes for bootstrap.sh as well as the updates for ulimits.

@mcrute mcrute closed this as completed Mar 31, 2019
@edisongustavo

@mcrute GPU AMIs have not been released. Shouldn't they be released as well?

@mcrute
Contributor

mcrute commented Apr 1, 2019

@edisongustavo the GPU AMIs were not impacted by this issue, only the standard AMIs so I only updated the standard ones.

@edisongustavo

@mcrute Ok, so I should assume that both versions (CPU and GPU) have different release lifecycles then.

@mcrute
Contributor

mcrute commented Apr 1, 2019

@edisongustavo we don't anticipate different release lifecycles for GPU and CPU going forward, outside of this patch release for the CPU AMI. There are some slight tooling differences that we're working to resolve to keep the images in sync.

@edisongustavo

I understand that @mcrute, that's great.

I don't know how different people use this, but we're using Terraform and this is how we find the AMIs to launch workers:


data "aws_ami" "eks-ami" {
  most_recent = true

  filter {
    name   = "name"
    values = ["amazon-eks-node-${var.k8s_version}-${var.eks_ami_version}"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }

  owners = ["602401143452"] # AWS EKS account
}

This breaks if the CPU and GPU AMIs have different versions, so I've added a separate version for each "flavor": cpu or gpu.
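
Outside Terraform, the same per-flavor lookup can be sketched in shell. This assumes the naming pattern seen in this thread (`amazon-eks-node-<k8s version>-v<YYYYMMDD>` and `amazon-eks-gpu-node-...`), so that a plain string sort orders names by release date; `latest_ami_name` is a hypothetical helper name.

```shell
# Hypothetical helper: read AMI names on stdin (one per line) and print the
# newest one for a given flavor prefix and Kubernetes version.
# Usage: latest_ami_name amazon-eks-node 1.12 < names.txt
latest_ami_name() {
    wanted="$1-$2-v"
    # Keep only names that start with "<prefix>-<version>-v" (literal match,
    # so "amazon-eks-node" does not pick up "amazon-eks-gpu-node").
    # The vYYYYMMDD suffix sorts chronologically as plain text.
    awk -v p="$wanted" 'index($0, p) == 1' | LC_ALL=C sort | tail -n 1
}
```

The names could come from the AWS CLI, for example `aws ec2 describe-images --owners 602401143452 --filters 'Name=name,Values=amazon-eks-*' --query 'Images[].Name' --output text | tr '\t' '\n'`, assuming the CLI is configured for the target region. Calling the helper once with `amazon-eks-node` and once with `amazon-eks-gpu-node` gives each flavor its own latest version.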

@whereisaaron

@mcrute how do people find the fixed AMIs from 31 March or later? The newest in the AMI registry is 29 March (e.g. amazon-eks-node-1.12-v20190329), which still has the ulimit problem. You said new AMIs were released 31 March, but they don't show. It's June now, but no new AMIs since 29 March?

The project home page and releases page list the latest AMIs as 27 March. And the AWS Marketplace says the latest AMI version is 20 February.
