Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(controller): permit umbrella override api group #111

Merged
merged 10 commits into from
Oct 1, 2020

Conversation

kingdonb
Copy link
Member

@kingdonb kingdonb commented Oct 26, 2019

This is the last PR to submit in the series upgrading the charts for K8s 1.16 in Hephy Workflow v2.23, unless I missed one.

This probably should not be merged without validating the e2e suite, I think that does not need any upgrade though because it does not deploy, only creates a Pod which was already at v1 quite some time ago.

I believe that this will not work without further RBAC changes that must be made in a later commit. I have not done enough research to say what must change exactly. There are also further changes needed in router based on my testing, but again I can't say exactly what is needed just yet.

See teamhephy/monitor#14 for more details about the change, which is the first PR opened in the series. The full list of associated PRs is as follows:

requires teamhephy/monitor#14
requires teamhephy/minio#8
requires teamhephy/logger#8
requires teamhephy/fluentd#14
requires teamhephy/postgres#14
requires teamhephy/builder#52
requires teamhephy/nsqd#9
requires teamhephy/redis#3
requires teamhephy/registry-proxy#2
requires teamhephy/registry-token-refresher#4
requires teamhephy/registry#5
requires teamhephy/router#44
requires teamhephy/workflow-manager#5
requires teamhephy/workflow#105

@kingdonb
Copy link
Member Author

kingdonb commented Nov 2, 2019

So, when we talked about this PR, we discussed that maybe it shouldn't use a global, but a template instead (like we have done for rbacAPIVersion) and that maybe it shouldn't require the user to set a parameter, but autodetected (like we did with .Capabilities.APIVersions.Has for rbacAPIVersion).

I am going to take a stab at updating these PRs this morning so that it does that instead. I will submit it as a separate branch/separate series of PRs, so we can choose between them. This one at least apparently functions, as a reference, for what needs to change in the charts across all of the repos.

I would like to explore using Atomist.io to make changes like this that require a similar patch across all of the repos, but I don't have time to explore it right now. But if we find there is a need to make more patches that affect every component, it might be a major time saver to use it.

@kingdonb
Copy link
Member Author

kingdonb commented Nov 2, 2019

Well, I changed my mind and just updated these PRs in place, since it was a lot less work.

I am going to try to do a test of this now. Once it is tested and I can certify that I haven't made any errors, we should think about publishing a workflow-beta release anyway even if we can't isolate the remaining issues, it will be easier to debug them from a published release that doesn't work than try to cobble together all of the changed pieces.

This is really the problem that e2e was created to solve I think, but I don't know how to supply it with more than a handful (this series of PRs) without the elaborate Jenkins system that I think it's safe to say we have left behind at this point.

@kingdonb
Copy link
Member Author

kingdonb commented Nov 2, 2019

This chart update should work without teamhephy/workflow#105 now, btw.

@Cryptophobia Cryptophobia requested review from Cryptophobia, felixbuenemann and duanhongyi and removed request for Cryptophobia November 4, 2019 17:05
@Cryptophobia
Copy link
Member

Looks good! I prefer this way much better without a global parameter. Each component should be able to be deployed individually if possible as well.

@felixbuenemann
Copy link
Contributor

I've not been following the 1.16 update story to closely lately, 'cause I was busy with other stuff.

Can someone bring me up to speed on how I would deploy this and the updated dependencies to a cluster for testing?

Would it make sense to have some kind of pre-release for that?

@kingdonb
Copy link
Member Author

kingdonb commented Nov 5, 2019

Yes, definitely. We should push out a beta release as soon as we can be sure which components need a new tag.

To my knowledge the only image that has been updated for 1.16 is Controller, and I am not certain that it is finished. I'm not 100% sure whether the error I saw came from our updated Python code in controller or another source, as I am not really competent with a Python debugger.

I'm also not 100% sure how to roll the charts as for a new release, to make it easy to test. Usually what we would do here, is publish a chart release to the "workflow-beta" repo. I have actually done the test of all these chart patches manually instead, as in: taken the previous release and fetch --untar followed by applying patches from the baseline of each component manually, to perform my test

(Which failed, although all of the components installed successfully as apps/v1 resources, and all came online, I could not deploy an app. I have not tested again since I modified the chart to use a template instead of a global variable from the umbrella chart. If it is easy to publish a "workflow-beta" release, or if someone knows a less error-prone way to put one together, it would probably be Anton, either that or I'm digging through that Jenkins jobs repo again.)

The process I followed was to take my updated charts dir, and simply add the controller image to it. Then create an app from "example-go" and try to push a revision out. The issues I had are documented at #110, comment here: issuecomment-546707022 Like I may have said already, there may be other components that needed updates within the image besides controller, but off the top of my head I wouldn't be sure which (or why.)

@Cryptophobia
Copy link
Member

@felixbuenemann

Can someone bring me up to speed on how I would deploy this and the updated dependencies to a cluster for testing?

We need to make individual changes to the component charts and deploy. Hopefully I can get some free time soon to make a workflow-beta umbrella chart with all of the chart changes and a new test image version for the controller.

Would it make sense to have some kind of pre-release for that?

@kingdonb :

If it is easy to publish a "workflow-beta" release, or if someone knows a less error-prone way to put one together, it would probably be Anton, either that or I'm digging through that Jenkins jobs repo again.)

I plan to do this soon. I'll msg you guys when it is ready to go and your PRs are merged in @kingdonb .

...here may be other components that needed updates within the image besides controller, but off the top of my head I wouldn't be sure which (or why.)

I am guessing it may be the deis-router but that is only a guess right now.

@kingdonb
Copy link
Member Author

kingdonb commented Nov 6, 2019

Great! Thursday is "First Thursday" and we can have the regular meeting, if you guys want to attend we can talk about it then.

I am almost certain that Router is not the only problem left on the table, as I was testing this in Experimental Native Ingress mode, I am virtually certain I did not have the router component running at all.

@kingdonb
Copy link
Member Author

kingdonb commented Nov 6, 2019

I will get ahold of the controller logs before Thursday, since they should contain a backtrace with more information

@Cryptophobia
Copy link
Member

Cryptophobia commented Jun 25, 2020

@kingdonb , I merged all the PRs for the chart changes. Should we go ahead and assemble a workflow-beta master chart but with all of the old images for the components. The controller we will change to to the new tag for this commit with these changes inside the workflow-beta master chart. Then we can deploy to k8s 1.16+ cluster and run workflow e2e?

Or do you think we should proceed differently?

@kingdonb
Copy link
Member Author

Sounds good. If you can throw up a beta chart with or without the controller in it, I will begin testing and upgrade the components one by one (starting with the controller, which is most likely I think to still have issues that are undocumented)

Kingdon Barrett and others added 10 commits July 3, 2020 16:11
This probably should not be merged without validating the e2e suite,

I believe that this will not work without further RBAC changes that must
be made in a later commit. I have not done enough research to say what
must change exactly. There are also further changes needed in router
based on my testing, but again I can't say exactly what is needed.
@kingdonb
Copy link
Member Author

kingdonb commented Jul 3, 2020

I just noticed this was not going to merge cleanly without a rebase first. (Rebased!)

I'm stitching things together by hand to perform the test, so my results are preliminary only right now, but so far so good.

@Cryptophobia
Copy link
Member

@kingdonb , can you rebase with master branch to see if we need to include something from this PR now that we merged #110 ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants