Feature/fix deprecated apis 1.16 #110

Merged

Conversation

Cryptophobia
Member

https://kubernetes.io/blog/2019/07/18/api-deprecations-in-1-16/

Per the recent blog post above, we need to migrate some of our API usage to work with the deprecations in Kubernetes 1.16.

I would really appreciate it if other people looked at these changes and tried to fix the broken tests. For now, overriding some of the URL generation in mock.py should fix the tests, but we still have broken tests with these changes.
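
For context, a rough, hypothetical sketch of what that kind of URL override for the mocks could look like (the helper name and mapping below are illustrative only, not the actual mock.py code):

# Hypothetical sketch only: illustrates switching the mocked Deployment and
# ReplicaSet URLs from extensions/v1beta1 to apps/v1. The names here do not
# match the real mock.py helpers.
API_PREFIXES = {
    "deployments": "apis/apps/v1",                  # was "apis/extensions/v1beta1"
    "replicasets": "apis/apps/v1",                  # was "apis/extensions/v1beta1"
    "ingresses": "apis/networking.k8s.io/v1beta1",  # was "apis/extensions/v1beta1"
}

def mock_resource_url(host, resource, namespace, name=None):
    """Build the URL a test mock should register for a namespaced resource."""
    url = "{}/{}/namespaces/{}/{}".format(host, API_PREFIXES[resource], namespace, resource)
    return "{}/{}".format(url, name) if name else url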

Steps to get a working development Django/Python environment:

  1. Install a local PostgreSQL database and initialize it with the postgres user, then edit /var/lib/pgsql/data/pg_hba.conf to allow local connections

  2. Install the local Django and Python dependencies in /rootfs:

     $ pip3 install -r requirements.txt --user
  3. Set these environment variables and make sure the PostgreSQL database is running on port 5432 (a quick connection sanity check is sketched after this list):

    export DEIS_DATABASE_NAME=deis
    export DEIS_DATABASE_PASSWORD=postgres
    export DEIS_DATABASE_USER=postgres
    export DEIS_DATABASE_SERVICE_HOST=localhost
    export DEIS_DATABASE_SERVICE_PORT=5432
  4. To run an individual test, run this command in /rootfs:

    python3 manage.py test --settings=api.settings.testing --noinput scheduler.tests.test_deployments
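
As mentioned in step 3, here is a quick sanity check that the test database is reachable with those variables. It assumes psycopg2 is installed (it is pulled in via requirements.txt in a typical setup, but verify in your environment):

# Sanity check: confirm the PostgreSQL server is reachable using the variables
# from step 3. Assumes psycopg2 is installed (see requirements.txt).
import os
import psycopg2

conn = psycopg2.connect(
    host=os.environ["DEIS_DATABASE_SERVICE_HOST"],
    port=os.environ["DEIS_DATABASE_SERVICE_PORT"],
    user=os.environ["DEIS_DATABASE_USER"],
    password=os.environ["DEIS_DATABASE_PASSWORD"],
    dbname=os.environ["DEIS_DATABASE_NAME"],
)
print(conn.get_dsn_parameters())  # shows the resolved connection settings
conn.close()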

@kingdonb left a comment (Member)

👍 I still need to test this more, but these changes look good to me

@kingdonb
Member

I don't think this is the final change: I was not able to see the Deployment API version apps/v1 being used when my upgraded controller created a new namespace and deployed an app via Buildpack.

I'm not sure what's wrong, but I'm revoking my checkmark; sorry for being premature.

@kingdonb left a comment (Member)

I don't think I understand the mock tests, but I'll try deploying a completely fresh cluster in case there is some stale state left on my instance contributing to the problem.

@felixbuenemann
Contributor

@Cryptophobia if you are running both Postgres and the Python app on your local system, you don't need to modify pg_hba.conf. Just make sure you have a database user that matches the user running the Python app; you can then log in to the database as that user without a password.

How do you plan on handling backwards-compatibility for pre-1.9.0 Kubernetes clusters?

The current change contains only one place where the version is checked, and the charts also hardcode the new value.

@kingdonb
Member

kingdonb commented Sep 23, 2019

How do you plan on handling backwards-compatibility for pre-1.9.0 Kubernetes clusters?

I think I understood, and I'm happy to be proven wrong (although it will make this transition a bit more complicated), that pre-1.9 K8s clusters didn't have extensions/v1beta1, but some earlier alpha API. I'm afraid this might not have been completely homogeneous across all versions, which comes back to the point that we should really be checking with the cluster API to find out which APIs it actually serves.
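
For illustration only (not code from this PR), a minimal sketch of that kind of discovery check, hitting the cluster's /apis endpoint with the requests library, could look like this:

# Illustrative only, not the controller's scheduler code: ask the cluster which
# API groups/versions it actually serves instead of guessing from its version.
import requests

def cluster_has_api(api_server, group_version, token=None, verify=True):
    """Return True if e.g. "apps/v1" appears in the cluster's /apis discovery doc."""
    headers = {"Authorization": "Bearer {}".format(token)} if token else {}
    resp = requests.get("{}/apis".format(api_server), headers=headers, verify=verify)
    resp.raise_for_status()
    groups = resp.json().get("groups", [])
    served = {v["groupVersion"] for g in groups for v in g.get("versions", [])}
    return group_version in served

# Pick the Deployment apiVersion based on what the cluster serves, for example:
# api_version = "apps/v1" if cluster_has_api(API_SERVER, "apps/v1", token=TOKEN) else "extensions/v1beta1"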

But if this was done right by upstream, for definitions of right that include "they made it easy for us," then we can simply drop support for v1beta1, substitute the v1 APIs in its place, and everything will be OK for all users on any version. Clusters with K8s versions older than 1.9 will be using deployment APIs that are older than v1beta1, and clusters with support for newer APIs will have no problem serving up the resources they manage with the newer APIs.

But like I said, I'm happy to be proven wrong. I have only done limited testing to understand the problem, and it's entirely possible there are important points I've glossed over (for example, can a 1.12 cluster really serve up extensions/v1beta1 resources through the apps/v1 API, or do you need to upgrade a cluster all the way to 1.16 to get the deprecated resources converted automatically?).

There are a lot of test scenarios which I haven't considered other than in passing, and it feels like this will be a good time for a minor version bump and an actual manual testing plan. I've started working on that.

I'm about 50/50 on whether we should go back and test those old versions. I maintained a 1.5 cluster for a long time, and once I got over RBAC it was ultimately very easy to upgrade it to 1.13; I've kept pace since then and it's on 1.15 now. I have yet to test this new controller on a 1.16 cluster that has removed the deprecated APIs, and I suspect that in its current form it will not work. (What I'm saying in a roundabout way is that versions <1.12 are no longer supported upstream, and we should not go to great lengths to ensure we maintain support for them permanently.)

So, that's where I'll start testing next, with a 1.16 cluster wherever I can put my hands on one.

@felixbuenemann
Contributor

I'm currently stuck with some clusters on 1.5.8 that are hard, but not impossible, to upgrade (mainly because they are based on an old version of kube-aws, so I need to either start with a fresh cluster and transfer all apps and databases, or keep backporting stuff from the new kube-aws), so I could test changes in a staging cluster.

@felixbuenemann
Contributor

I just tried to change the apiVersion on deploy/deis-router from extensions/v1beta1 to apps/v1 and the change was rejected, so I think the templates need some usage of Capabilities.APIVersions.Has to select the proper apiVersion.

See: https://helm.sh/docs/chart_template_guide/#built-in-objects

@felixbuenemann left a comment (Contributor)

I've added some comments with what I think is needed to handle backwards compatibility.

@@ -1,4 +1,4 @@
-apiVersion: extensions/v1beta1
+apiVersion: apps/v1
Contributor

I think this should be something like the following to keep backwards compatibility with older clusters:

Suggested change (replacing the line "apiVersion: apps/v1"):

{{- if .Capabilities.APIVersions.Has "apps/v1" }}
apiVersion: apps/v1
{{- else }}
apiVersion: extensions/v1beta1
{{- end }}

Member Author

Going to try the above and see if it works for us. 👍 We have had problems using .Capabilities.APIVersions.Has before.

Contributor

The docs mention that checking for resources (apps/v1/Deployment) requires at least Helm v2.15, but just using apps/v1 (without a resource) should be fine, if that's specific enough.

Member

You can perhaps just check for "v1". I think you might not find specific API groups, but you can count on support for apps/v1 if APIVersions has "v1".

There was an issue on the Prometheus Helm chart repo that covered this; I will look to see what they settled on to solve the backwards-compatibility question, and whether it differs.

Member Author

Following this suggestion in 5930851. 👍 I found an example in a Datadog agent chart that looks very similar to what we are doing here:

https://github.com/helm/charts/blob/2215d0f752370ae75fa54a9f62b057ed14f8e744/stable/datadog/templates/agent-clusterchecks-deployment.yaml#L2-L9

I think they are adding the final else to make sure it still gets deployed using the default apiVersion of apps/v1 even if the deploying agent (Helm or whatever...) does not have permission to view .Capabilities.APIVersions?

rootfs/scheduler/mock.py (review thread resolved)
rootfs/scheduler/mock.py (outdated, resolved)
rootfs/scheduler/resources/ingress.py (review thread resolved)
rootfs/scheduler/resources/replicaset.py (outdated, resolved)
@Cryptophobia
Member Author

Thank you for your notes, @felixbuenemann. I wanted to add everyone involved in this PR and see how we can best handle the backwards compatibility. I was not sure if we want to support <1.9 clusters in the future anyway, but it is good to get feedback from users running Hephy.

I'm going to take your notes about the charts and versions and make some more changes.

@till

till commented Sep 27, 2019

Do people need to run the kubectl convert commands on existing installations?

@kingdonb
Member

That's a good question. I think the answer is yes, but more testing is needed. In the past we have decided which API version to use strictly by a lowest-common-denominator strategy, paying attention to the kubectl API version. We have missed some opportunities to upgrade clusters >1.12 and preemptively ensure they are using the non-beta APIs before the deprecation.

I think it would be best if we supported both API versions on clusters that are pre-1.16 and can support both, but the alternative would be to upgrade everyone on clusters that should support it. I'm not sure we can reliably predict that based on the version number, but it is a strategy that has worked for us in the past.

It sounds like we could provide a doc for people who are ready to upgrade Deployment and other cluster resources to the v1 API using kubectl convert, so they can do it at their convenience, and those who would be dragged kicking and screaming may get converted automatically if needed.

I didn't know there was a kubectl convert command, and the documentation for 1.16 has led me to believe that objects in the deprecated APIs will still be available (e.g. they would be converted automatically when support for the old APIs is dropped in an upgraded cluster). I don't know if we need to do this in Hephy, but if there are environments that could be upgraded by the controller in a way that is more or less seamless, that would be awesome.

Let's make it clear what happens for cluster operators; this would be a great addition to the documentation once we have worked out exactly what the v1 API transition plan should look like. I like the way we handled the Postgres upgrades, and I don't have a problem with one-way-only transitions if necessary, but we should also consider what it might look like for an upgrade to go badly and how it could be safely reverted, if possible.

@felixbuenemann
Contributor

I don't think you need to do anything for a running cluster.

The kubectl convert command is deprecated with the following message:

kubectl convert is DEPRECATED and will be removed in a future version.
In order to convert, kubectl apply the object to the cluster, then kubectl get at the desired version.

It's not entirely clear from that message whether the cluster needs to support both the old and new representations, or if new cluster versions will still accept old resource manifests and convert them on the fly.

@kingdonb
Member

Ah, so I had misunderstood. The API version is just a facade over a single resource in the back end, and it can be queried with any supported API version. And then I guess, other than the RESTful resource interface itself, which changes shape in some ways, behavior should not be expected to change when upgrading client API versions (only kubelet versions).

@Cryptophobia
Member Author

@felixbuenemann @kingdonb @till

The API version is just a facade over a singular resource in the back-end, and can be queried with any supported API version.

That was my interpretation of the deprecation messages. With the Go client for k8s, I remember querying the same resources on different API endpoints and receiving the same payload/resources. I noticed this when querying the APIs for the Ingress and Service objects. I hope this is still the case, because it will make things much easier for us. 🐿️
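
To make that concrete, here is a small illustrative sketch (not controller code, and it only applies to clusters that still serve both groups, i.e. pre-1.16): the same Deployment can be read through either endpoint and the backing object is the same. The server address, token, and deployment name below are placeholders.

# Illustrative only: on a cluster that still serves both groups (pre-1.16), the
# same Deployment can be fetched via extensions/v1beta1 and apps/v1; only the
# apiVersion field of the response differs, the backing object is the same.
import requests

API_SERVER = "https://kubernetes.default.svc"  # placeholder
HEADERS = {"Authorization": "Bearer <token>"}  # placeholder service-account token

def get_deployment(group_version, namespace, name):
    url = "{}/apis/{}/namespaces/{}/deployments/{}".format(
        API_SERVER, group_version, namespace, name)
    resp = requests.get(url, headers=HEADERS, verify=False)  # skip TLS verify for the sketch only
    resp.raise_for_status()
    return resp.json()

old = get_deployment("extensions/v1beta1", "deis", "deis-controller")  # example name
new = get_deployment("apps/v1", "deis", "deis-controller")
assert old["spec"]["template"] == new["spec"]["template"]  # same underlying object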

@Cryptophobia
Member Author

I've made quite a few changes now. Can you try testing this out again on a 1.16 cluster if you can? @felixbuenemann @kingdonb

Testing this requires building a new image and deploying it using the new chart updates, so that the controller gets deployed with the right apiVersion (apps/v1) going forward.

@felixbuenemann
Contributor

@kingdonb I don't have a 1.16 cluster for testing ATM, do you?

Shouldn't we also test on older clusters to make sure nothing breaks?

@Cryptophobia
Member Author

Cryptophobia commented Oct 22, 2019

Shouldn't we also test on older clusters to make sure nothing breaks?

Yes, we need to test older versions to make sure backwards compatibility works, and also check that the upgrade path to 1.16 from older versions like 1.15 and 1.14 works without problems, just in case.

@kingdonb
Member

kingdonb commented Oct 22, 2019

I do in fact have a Kubeadm cluster running 1.16.2 now, so I can run these tests on the next version.

edit:

And I have an example of the error you will get if you try to install the latest stable Hephy release on it:

$ helm install hephy/workflow --namespace deis --set global.use_rbac=true,router.host_port.enabled=true
Error: validation failed: [unable to recognize "": no matches for kind "DaemonSet" in version "extensions/v1beta1", unable to recognize "": no matches for kind "Deployment" in version "extensions/v1beta1"]

No surprises there. This is using Helm v2.15.0, the last minor version that will handle releases with a Tiller pod. I manually installed this cluster onto some Ubuntu nodes using kubeadm v1.16.2.

@kingdonb
Member

So, I updated the charts in #111 and tried this updated controller out; there are still some things to work out. Pushing the example-go app to the builder as a test, the slug builds successfully, but the deploy results in an error:

remote: 2019/10/27 15:40:59 Error running git receive hook [The controller returned an error when publishing the release: Unknown Error (400): {"detail":"(app::deploy): unorderable types: str() < NoneType()"}]

This is distinct from the previous error, so the ball is moving forward! But there is something wrong here, and I'm not sure how to debug it further by myself.
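
For anyone debugging this: "unorderable types: str() < NoneType()" is the Python 3 error raised when code compares a string to None with an ordering operator (something Python 2 silently allowed), so somewhere in the deploy path a value, probably a version or revision string, is None when it gets compared. A tiny standalone reproduction of that class of error (not the controller code):

# Reproduces the class of error above: Python 3 refuses to order mixed types,
# so a comparison like version_string < None raises instead of silently passing.
revision = "2"    # e.g. a release/revision string
previous = None   # e.g. a lookup that returned nothing

try:
    is_older = revision < previous
except TypeError as err:
    print(err)  # Python 3.5 wording: unorderable types: str() < NoneType()
                # (newer Pythons word it as "'<' not supported between instances ...")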

@kingdonb
Member

kingdonb commented Nov 2, 2019

I merged this with #111 but I can't verify that it works. It may be an issue in another component that causes the problems I'm seeing.

Signed-off-by: Cryptophobia <aouzounov@gmail.com>
@Cryptophobia
Member Author

Merging this in now that we have done testing and are preparing for the new release, Hephy Workflow v2.22.0!

@Cryptophobia Cryptophobia merged commit 20506f4 into teamhephy:master Oct 1, 2020
@Cryptophobia
Member Author

Fixes #125 in the new release, Hephy Workflow v2.22.0, coming soon!
