Consider pinning kind images to a sha #8815

Closed
killianmuldoon opened this issue Jun 7, 2023 · 8 comments
Labels
triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@killianmuldoon
Contributor

We currently set KIND image versions either by their tag or using logic to retrieve the latest stable image for a certain minor release of Kubernetes. This is against KIND's guidelines as outlined in the release notes:

New node images have been built for KIND v0.19.0, please use these exact images (IE like kindest/node:v1.26.3@sha256:61b92f38dff6ccc29969e7aa154d34e38b89443af1a2c14e6cfbd2df6419c66f including the digest) or build your own as we may need to change the image format again in the future 😅

And also in this comment from @BenTheElder #8788 (comment)

We should consider pinning the images we use from KIND. Currently we set the version in a few places:

  1. The default image tag used in CAPD and the e2e test KIND provider
  2. When resolving the version for upgrade tests
  3. When resolving the version for images in CAPD.

(I may have missed some other places)

We could solve (1) fairly easily by just pinning the tag to a known kindest/node release with the relevant sha.

For (2) and (3), though, we'd need a source of truth - just a YAML file or similar - with a list of Kubernetes versions linked to the current KIND release. This would need to be updated at least for every KIND release and likely every Kubernetes release (when a kindest/node image gets published). I'm not sure if the images have any other markers or labels we could use to distinguish this.
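
A minimal sketch of what such a source of truth could look like if it lived in Go rather than YAML. The names `kindImage`, `knownImages` and `resolveImage` are hypothetical and not taken from Cluster API; the only entry is the example image quoted from the KIND v0.19.0 release notes above, and a real list would need one entry per supported Kubernetes version.

```go
package main

import "fmt"

// kindImage is a hypothetical record linking a node image to the KIND
// release it was published for.
type kindImage struct {
	KindVersion string // KIND release the image was built for
	Image       string // full image reference, including the sha256 digest
}

// knownImages is a hypothetical source of truth keyed by Kubernetes version.
// The single entry is the example from the KIND v0.19.0 release notes.
var knownImages = map[string]kindImage{
	"v1.26.3": {
		KindVersion: "v0.19.0",
		Image:       "kindest/node:v1.26.3@sha256:61b92f38dff6ccc29969e7aa154d34e38b89443af1a2c14e6cfbd2df6419c66f",
	},
}

// resolveImage returns the pinned image for a Kubernetes version, or an
// error when the list has no entry for it.
func resolveImage(k8sVersion string) (string, error) {
	img, ok := knownImages[k8sVersion]
	if !ok {
		return "", fmt.Errorf("no pinned kindest/node image known for Kubernetes %s", k8sVersion)
	}
	return img.Image, nil
}

func main() {
	for _, v := range []string{"v1.26.3", "v1.27.0"} {
		img, err := resolveImage(v)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(v, "->", img)
	}
}
```

Keeping the KIND version next to each image makes it easier to spot stale entries when KIND is bumped.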

For the e2e tests, when we don't have an image for the latest version of Kubernetes available, it can be built on the fly. This will make upgrade tests take longer and use more CPU etc., but these tests only run once a day so they're not that intensive.

For resolving images for CAPD we would probably need to return good errors / warnings so users know that some images that may have worked in the past will no longer be used by CAPI, e.g. after upgrading to KIND v0.20.0 there may not be a recommended v1.27.0 image.

We can tackle all three of these problems separately, as they're decoupled. The only one which has hard tradeoffs IMO is (3) so it would be good to hear more opinions on ways to deal with it.

Ref: #8788

@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Jun 7, 2023
@fabriziopandini
Member

/triage accepted

If it is required to make us compliant with kind recommendations, I'm +1
I'm happy to pair in brainstorming possible solutions

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jun 12, 2023
@sbueringer
Member

For 2 I would prefer to always build ourselves, unless there is a low-maintenance way to automatically use compatible images or build ourselves.

I wonder how this will affect CAPD users (quick start). Would they have to specify the version with a SHA? Or would we have an allow list of Kubernetes versions that we automatically map to hardcoded SHAs? I think it's tricky as we still want to make sure self-built images work as well.

@killianmuldoon
Contributor Author

I wonder how this will affect CAPD users (quick start). Would they have to specify the version with a SHA? Or would we have an allow list of Kubernetes versions that we automatically map to hardcoded SHAs? I think it's tricky as we still want to make sure self-built images work as well.

I think mapping versions -> known SHAs is the only real way for us to do this. The easiest way to do so is to keep an updated list in the repo, but it requires some manual effort - mostly the KIND bump + when using a new version of Kubernetes.

I'm also fine with making the e2e tests always-build. We should enable pinning images by sha at the same time though - some logic like if it has a SHA use it, if not always build - so we can speed up upgrade tests running locally etc.
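
The "if it has a SHA use it, if not always build" rule could be sketched roughly as follows. `hasDigest` and `shouldBuild` are hypothetical helper names, and the check assumes a digest of the form `@sha256:` followed by 64 lowercase hex characters; it is an illustration, not the logic used in the e2e framework.

```go
package main

import (
	"fmt"
	"strings"
)

// hasDigest reports whether an image reference ends in a sha256 digest,
// e.g. "kindest/node:v1.26.3@sha256:<64 hex characters>".
func hasDigest(image string) bool {
	_, digest, found := strings.Cut(image, "@sha256:")
	if !found || len(digest) != 64 {
		return false
	}
	for _, c := range digest {
		if !strings.ContainsRune("0123456789abcdef", c) {
			return false
		}
	}
	return true
}

// shouldBuild applies the rule from the comment above: use the image as-is
// when it is pinned by digest, otherwise build it.
func shouldBuild(image string) bool {
	return !hasDigest(image)
}

func main() {
	images := []string{
		"kindest/node:v1.26.3@sha256:61b92f38dff6ccc29969e7aa154d34e38b89443af1a2c14e6cfbd2df6419c66f",
		"kindest/node:v1.27.0", // tag only, no digest
	}
	for _, img := range images {
		fmt.Printf("%s build=%t\n", img, shouldBuild(img))
	}
}
```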

@sbueringer
Member

sbueringer commented Jun 13, 2023

The easiest way to do so is to keep an updated list in the repo, but it requires some manual effort - mostly the KIND bump + when using a new version of Kubernetes.

Hm yup. Maybe it's fine if it's just part of the kind bump to update the list in some go file

some logic like if it has a SHA use it, if not always build

Yup. Maybe just a small test helper binary that reads the list in the go file and tells us if we have to build ourselves

Just had another idea for users.

  • I think it would be ideal if they still only have to specify the version without a SHA
  • We could check that version in some CAPD webhook if it's in the list of our versions with a pinned SHA and return a warning if it isn't. Shouldn't be a problem to have a CAPD webhook which is called e.g. on Cluster create/update (could be enough for CC clusters).
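
The check such a webhook would run could look roughly like this. It is a standalone sketch that does not use the real webhook interfaces; `pinnedVersions` and `versionWarnings` are hypothetical names, and the list holds only the version from the release notes quoted earlier in this issue.

```go
package main

import "fmt"

// pinnedVersions is a hypothetical set of Kubernetes versions for which a
// digest-pinned kindest/node image is known.
var pinnedVersions = map[string]bool{
	"v1.26.3": true,
}

// versionWarnings returns a warning, not an error, when the requested
// version has no pinned image, so that self-built images keep working.
func versionWarnings(k8sVersion string) []string {
	if pinnedVersions[k8sVersion] {
		return nil
	}
	return []string{
		fmt.Sprintf("no pinned kindest/node image is known for Kubernetes %s; the image may not match the installed KIND release", k8sVersion),
	}
}

func main() {
	for _, v := range []string{"v1.26.3", "v1.27.0"} {
		fmt.Println(v, versionWarnings(v))
	}
}
```

Returning warnings instead of rejecting the request keeps the quick start flow unchanged for users who only specify a version.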

@killianmuldoon
Contributor Author

/assign

I'd like to get this done ahead of the KIND v0.20.0 release so we don't have to deal with the issues described in #6264 until we update.

Basic approach:

  1. Add a SHA to the default kind image used in CAPD / e2e
  2. Always build images for upgrade tests, unless they specify a SHA.

We can treat automatically resolving known-good images as a follow-up as it's not completely essential here. It may begin to impact CAPD users soon though.

@killianmuldoon
Contributor Author

/close

The following were done to resolve this issue:

  1. Add always build to K8s upgrade tests: 🐛 Always build Kind images for upgrade tests #8859
  2. Add a KindMapper in CAPD which always tries to resolve a Kind version to a specific sha: 🐛 Add kind mapper #8880
  3. Pin templates in older versions of Cluster API to the relevant shas for e2e tests: 🐛 Pin kindest/node images to known good versions in clusterctl upgrade tests #8860

Going forward the list of KIND shas will need to be updated with each new KIND release so that we're normally using the latest images available. We might be able to find some way to automate this in future.

@k8s-ci-robot
Contributor

@killianmuldoon: Closing this issue.


@fabriziopandini
Member

@killianmuldoon kudos for all the hard work here, this is keeping the lights on for the next release
