generator command: 'kong-2.4.0' generator command version: 2b9dc2a
MicroK8s CI also seems to be broken at the cluster setup step; need to figure out what's broken there.
So it seems to me that we have two ways to proceed:
- release the chart that's compatible with 1.22 BUT requires that the user reapply all their Kong CRs using the new API group (this is an important breaking change and, I'd say, therefore requires a major version bump),
- try to figure out if we can achieve 1.22 compatibility in a way that allows the user to upgrade transparently (which I'm not particularly optimistic about).

Unless we achieve a breakthrough with the latter, we'll probably need to go for the former (and I strongly prefer that we bump the major version in that case).
Release 0.9.0 and change the kongs CRD API group from charts.helm.k8s.io to charts.konghq.com. The new group is not one of the protected groups established by kubernetes/enhancements#1111; this operator CRD should not use a protected group, as it is not a core part of the Kubernetes project. This change makes the CRD compatible with Kubernetes >=1.22. However, it breaks compatibility with previous versions of the operator. As such, 0.9.0 has no `replaces` version: it requires a fresh operator install and a fresh set of Kong CRs.
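In practice this means existing Kong CRs must be recreated under the new group. A minimal sketch of what that looks like on a CR (the version suffix, resource name, and spec contents here are illustrative, not taken from the repo):

```yaml
# Old CR (pre-0.9.0), rejected once the old CRD is gone:
# apiVersion: charts.helm.k8s.io/v1alpha1
# kind: Kong

# New CR (0.9.0+): same spec, new group.
apiVersion: charts.konghq.com/v1alpha1
kind: Kong
metadata:
  name: example-kong
spec:
  # chart values pass through unchanged
  ingressController:
    enabled: true
```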
The first option (new CRD group) appears to be what others have done (see #65 (comment)). I'm still unsure about the version, since a 1.x release would sorta imply more production readiness, and we've technically never moved this out of the alpha phase (though in practice it's a wrapper around the chart, and we do consider the chart production-ready). The changelog has a draft of a migration strategy; I need to test that it actually works.

Tests are now failing because Ingresses aren't getting status updates, which I can replicate locally. Not sure why, since there are no controller errors. Maybe there's a behavior change where we previously set ClusterIP proxy Service IPs as Ingress load balancer status IPs, but don't any longer in 2.0?

Edit: Apparently sort of. On 2.x, `status.loadBalancer` is left empty (`{}`); on 1.x it contains a single empty ingress entry (`ingress: [{}]`). The latter passes the regex check.
Remove the Ingress status waits and add retry configuration to curl when validating the Ingress configuration.

KIC 2.0+ handles status updates for non-LoadBalancer Services differently than earlier versions. Previously, KIC would set a status with a 0-length list of ingresses if the proxy Service was not type LoadBalancer, e.g.

```yaml
status:
  loadBalancer:
    ingress:
    - {}
```

As of KIC 2.0, no status is set if the Service is not type LoadBalancer, e.g.

```yaml
status:
  loadBalancer: {}
```

This change to the operator tests confirms that Ingress configuration was successfully applied to the proxy using requests through the proxy only. These now run immediately after the upstream Deployment becomes available, however, so the curl 200 checks now retry to account for lag before the controller ingests that configuration. `--retry-all-errors` retries on either 404s (Ingress not yet ingested) or 503s (Endpoints not yet added). The 404 checks are unchanged, since they should succeed immediately once the 200 checks do.
This reverts commit 8a0d10f.
Reworking the test to use curl retries did not work: `--retry-all-errors` (https://curl.se/docs/manpage.html#--retry-all-errors) is not available in the version of curl shipped with Ubuntu 20.04. We'd probably need to either use https://github.com/marketplace/actions/retry-step, rework how 2.x does no-IP status to match the 1.x behavior, or rework the curl tests to work with wait_for.
Remove the Ingress status waits and add retry configuration to curl when validating the Ingress configuration.

KIC 2.0+ handles status updates for non-LoadBalancer Services differently than earlier versions. Previously, KIC would set a status with a 0-length list of ingresses if the proxy Service was not type LoadBalancer, e.g.

```yaml
status:
  loadBalancer:
    ingress:
    - {}
```

As of KIC 2.0, no status is set if the Service is not type LoadBalancer, e.g.

```yaml
status:
  loadBalancer: {}
```

This change to the operator tests confirms that Ingress configuration was successfully applied to the proxy using requests through the proxy only. These now run immediately after the upstream Deployment becomes available, however, so they may run before the controller has ingested Ingress configuration or observed Endpoint updates. To account for this, the curl checks are now wrapped in wait_for to allow a reasonable amount of time for the controller to update configuration.
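The wait_for-wrapped curl check described above can be sketched as a small POSIX shell helper. This is illustrative only: the function name, timings, and attempt counts are made up here and are not the repo's actual `wait_for` implementation; it simply polls until curl reports the expected HTTP status, standing in for the `--retry-all-errors` flag missing from Ubuntu 20.04's curl.

```shell
#!/bin/sh
# wait_for_http: poll a URL until curl reports the expected HTTP status
# code, or give up after a fixed number of attempts. Covers both the
# 404 (route not yet ingested) and 503 (Endpoints not yet added) windows.
wait_for_http() {
  url="$1"
  want="$2"
  attempts="${3:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -o /dev/null discards the body; -w '%{http_code}' prints only the status.
    # "|| true" keeps connection-refused errors from aborting the loop.
    got="$(curl -s -o /dev/null -w '%{http_code}' "$url" || true)"
    [ "$got" = "$want" ] && return 0
    i=$((i + 1))
    sleep 2
  done
  echo "timed out waiting for $want from $url (last: $got)" >&2
  return 1
}
```

Used as, e.g., `wait_for_http http://proxy.example/echo 200`, it retries through the transient errors that appear while the controller catches up, then fails loudly if the configuration never converges.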
Quasi-related PR for something I found during testing; we should include it in the release: #68
I retract my strong preference for a major version bump - after further consideration, this breaking change (changing the API group) is OK to make between 0.x releases, and is probably a better tradeoff.
Let's include #68 and go with v0.9.0.
Co-authored-by: Michał Flendrich <michal@flendrich.pro>
Manual testing of the upgrade steps is documented below. I wasn't able to test a complete OLM install of 0.9.0 with the SDK version we use, so that's simulated to confirm that the resulting Kong CRs result in what we expect. That shouldn't be an issue, since the image has no problem reading the current CRD API (else it would presumably be unable to act on the CRs) and because we don't upload any OLM content other than the YAML CSV and CRD--any OLM machinery parsing is presumably handled by the actual OLM install, i.e. OperatorHub.io in the real world, and that is up to date.

I briefly looked at migrating to 1.0, but it's involved enough that it'd take a while to verify and would make the diff look like garbage, so in the interest of time and PR clarity, I think we should defer that.
0.9.0 updates the chart to 2.4.0. This includes the KIC 2.0 release and CRD/admission/etc. API version updates for Kubernetes 1.22 compatibility.
It updates the operator-specific Kong CRD to apiextensions.k8s.io/v1. It still lacks a structural schema (it'd need one for the entirety of values.yaml, which would be quite an undertaking), so the entire spec has `x-kubernetes-preserve-unknown-fields: true`. I used the same strategy as we used for the KIC CRDs to upgrade it: I uploaded it to a 1.21 cluster, let Kubernetes auto-upgrade it, pulled down the upgraded version, and stripped instance-specific metadata. AFAIK we've never generated this automatically from anything beyond the initial stub version created by operator-sdk. We did use an older version of the SDK to create it, though a version created with something more recent isn't hugely different (mostly it just breaks out some metadata in the schema--the main spec part remains empty and just preserves unknown fields).

This remains a draft because the Kong CRD is still not fully 1.22 compatible. We originally used `charts.helm.k8s.io` for the CRD group, and the `k8s.io` suffix now has special requirements: protected groups must carry an api-approved.kubernetes.io approval annotation, so attempting to apply the CRD as-is is rejected. We presumably have no need for an approved API hostname, so it makes sense to move that elsewhere. There isn't a whole lot of guidance about how to choose a group (Operator SDK docs and Kubernetes docs just use `example.com` without additional guidance; another operator I reviewed uses the operatorhub.io domain), so I guess we'd probably use either `charts.configuration.konghq.com` or `charts.konghq.com`.

Changing the group does mean that prior-version CRs would not automatically upgrade. Not sure if we want to automate this or just configure the metadata and write changelog instructions to indicate that you must manually create a new release using the updated CR and copy in your spec.
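As a rough sketch, the apiextensions.k8s.io/v1 CRD with the pass-through schema described above would look something like this. The version name `v1alpha1` and the names block are guesses for illustration; the real CRD was produced by letting a 1.21 cluster auto-upgrade the old version, not written by hand:

```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: kongs.charts.konghq.com
spec:
  group: charts.konghq.com   # vendor-owned group; no api-approved annotation needed
  names:
    kind: Kong
    listKind: KongList
    plural: kongs
    singular: kong
  scope: Namespaced
  versions:
  - name: v1alpha1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            # No structural schema for the full values.yaml; accept anything.
            type: object
            x-kubernetes-preserve-unknown-fields: true
```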
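The manual migration path (recreate each Kong CR under the new group, copying the spec) could be sketched as below. This is a hypothetical helper, not anything in the repo; it assumes the only required change is the apiVersion group, and that the exported manifest keeps the standard `apiVersion:` line at column zero:

```shell
#!/bin/sh
# Sketch: rewrite a saved Kong CR manifest from the old group to the new
# one, preserving the version suffix and everything else in the file.
# Pair with, e.g.:
#   kubectl get kong my-kong -o yaml > old.yaml
#   migrate_kong_cr old.yaml new.yaml
#   kubectl apply -f new.yaml   # after installing the 0.9.0 operator/CRD
migrate_kong_cr() {
  in="$1"   # manifest exported from the old operator install
  out="$2"  # manifest to apply against the new CRD
  sed 's|^apiVersion: charts\.helm\.k8s\.io/|apiVersion: charts.konghq.com/|' "$in" > "$out"
}
```

Instance-specific metadata (resourceVersion, uid, status) would still need to be stripped from the export before applying, same as any `kubectl get -o yaml` round-trip.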
Lastly, similar to #64, we lack automation to push the image, and will need to build and push that manually.