Improve TaskRun Execution via Supervisor / Conductor Container #2391

Closed
2 tasks
waveywaves opened this issue Apr 14, 2020 · 13 comments
Labels
kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. kind/design Categorizes issue or PR as related to design. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@waveywaves
Member

waveywaves commented Apr 14, 2020

Expected Behavior

Currently, TaskRuns have no central execution manager or "runtime" that takes care of executing the Steps in a Task (which must run in a certain order, or in a manipulated order when debugging). This is a problem because the logic needed to execute the Steps lies within the Steps themselves, through the entrypoint image, whereas an external supervisor / conductor should be doing the "orchestration".

Steps in a TaskRun execute based on instructions given directly by the controller, instead of instructions given by a TaskRun runtime (which does not exist yet). This runtime could be provided by the Supervisor Container. Setting the order of execution should then be delegated to the runtime rather than done by the controller itself.

It should also be possible to expose a debug port from the Supervisor Container when in Debug Mode, so that user interfaces can attach to it and users can debug the Steps at their own discretion, in the same way they would debug a C program with GDB.
Debug Mode for TaskRuns is being tracked in #2069.
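
To make the proposal a bit more concrete, here is a rough sketch of what "ordering owned by a supervisor" could look like. It is written in Python purely for illustration and is not Tekton code: the `Supervisor` class, its `run` and `skip` methods, and the idea of naming breakpoints by step name are all assumptions made for this example. Steps are plain callables standing in for containers.

```python
class Supervisor:
    """Hypothetical supervisor that owns the order in which steps run."""

    def __init__(self, steps, breakpoints=()):
        self.steps = list(steps)              # ordered list of (name, callable)
        self.breakpoints = set(breakpoints)   # step names to pause in front of
        self.next_index = 0                   # index of the next step to run
        self.completed = []                   # names of steps that have run

    def run(self):
        """Run steps in order until the end or until a breakpoint is reached.

        Returns the name of the step the supervisor paused in front of,
        or None when every remaining step has run.
        """
        while self.next_index < len(self.steps):
            name, action = self.steps[self.next_index]
            if name in self.breakpoints:
                # Pause only once per breakpoint so a later run() resumes here.
                self.breakpoints.discard(name)
                return name
            action()
            self.completed.append(name)
            self.next_index += 1
        return None

    def skip(self):
        """Skip the next pending step without running it (a debug-only action)."""
        if self.next_index < len(self.steps):
            self.next_index += 1


if __name__ == "__main__":
    steps = [
        (name, lambda name=name: print("running", name))
        for name in ("clone", "build", "test")
    ]
    supervisor = Supervisor(steps, breakpoints={"build"})
    print("paused before:", supervisor.run())  # stops in front of "build"
    print("paused before:", supervisor.run())  # resumes and runs to the end
```

The point of the sketch is only that pausing, resuming and skipping become ordinary operations once a single component decides which step runs next. How a real supervisor container would start and stop step containers is left open here.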

Actual Behavior

The order of execution of the Steps is set by the controller through wait-files in a shared volume. A Step executes once the wait-file it is waiting on comes into existence. This behaviour is completely autonomous, and therefore it is not possible to manipulate or halt the order of execution at runtime.
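
For readers unfamiliar with the wait-file mechanism, the following is a minimal sketch of the idea, not the actual entrypoint implementation. It is in Python for brevity; the function names, the numeric file names, and the use of threads and a temporary directory (standing in for containers and the shared volume) are assumptions for illustration. As a simplification, the first step here waits on nothing.

```python
import os
import tempfile
import threading
import time


def wait_for_file(path, poll_interval=0.01, timeout=5.0):
    """Poll until `path` exists. Returns False if the timeout expires first."""
    deadline = time.monotonic() + timeout
    while not os.path.exists(path):
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
    return True


def run_step(name, wait_file, post_file, log):
    """Wait for the previous step's file, do the work, then signal the next step."""
    if wait_file is not None and not wait_for_file(wait_file):
        raise TimeoutError(f"{name}: wait-file {wait_file} never appeared")
    log.append(name)  # stand-in for the step's real command
    with open(post_file, "w"):
        pass  # creating the file is the whole signal


def run_chain(step_names):
    """Start every step at once; the wait-files alone enforce the order."""
    log = []
    with tempfile.TemporaryDirectory() as shared:  # stand-in for the shared volume
        threads = []
        for index, name in enumerate(step_names):
            wait_file = os.path.join(shared, str(index - 1)) if index > 0 else None
            post_file = os.path.join(shared, str(index))
            threads.append(
                threading.Thread(target=run_step, args=(name, wait_file, post_file, log))
            )
        # Start in reverse to show that start order does not matter.
        for thread in reversed(threads):
            thread.start()
        for thread in threads:
            thread.join()
    return log


if __name__ == "__main__":
    print(run_chain(["clone", "build", "test"]))  # prints ['clone', 'build', 'test']
```

Note that once the chain is started, nothing outside the steps can pause or reorder it short of withholding a file, which is the limitation described above.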

Checkpoints

@waveywaves
Member Author

cc @skaegi @imjasonh

@afrittoli
Member

Would it be possible to inject the supervisor container only in case of debug, and have Tekton working with entrypoints alone for now?
That would give us the chance to get to know the "supervisor mode" better, so we'd have more data to decide whether we want it for Tekton in normal mode too or not.

@imjasonh
Member

> Would it be possible to inject the supervisor container only in case of debug, and have Tekton working with entrypoints alone for now?
> That would give us the chance to get to know the "supervisor mode" better, so we'd have more data to decide whether we want it for Tekton in normal mode too or not.

I think I'd actually prefer the opposite: if we think a supervisor will be easier/better, explore and experiment with it without adding debug mode, eventually replace the file-based signalling we do today, and then and only then, start to support pausing/unpausing steps. That separates the infrastructure upgrade work from the feature work, and lets us bail early if the infra change turns out to be not worth doing.

@ghost added the kind/cleanup and kind/design labels Apr 17, 2020
@tekton-robot
Collaborator

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.

/lifecycle stale

Send feedback to tektoncd/plumbing.

@tekton-robot
Collaborator

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.

/lifecycle rotten

Send feedback to tektoncd/plumbing.

@tekton-robot
Collaborator

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

/close

Send feedback to tektoncd/plumbing.

@tekton-robot added the lifecycle/stale label Aug 14, 2020
@tekton-robot
Collaborator

@tekton-robot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

/close

Send feedback to tektoncd/plumbing.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@tekton-robot added the lifecycle/rotten label Aug 14, 2020
@vdemeester
Member

/remove-lifecycle rotten
/remove-lifecycle stale
/reopen

@tekton-robot reopened this Aug 17, 2020
@tekton-robot
Collaborator

@vdemeester: Reopened this issue.

In response to this:

/remove-lifecycle rotten
/remove-lifecycle stale
/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@tekton-robot removed the lifecycle/rotten and lifecycle/stale labels Aug 17, 2020
@tekton-robot
Collaborator

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.

/lifecycle stale

Send feedback to tektoncd/plumbing.

@tekton-robot added the lifecycle/stale label Nov 15, 2020
@vdemeester
Member

/remove-lifecycle stale

@tekton-robot removed the lifecycle/stale label Nov 16, 2020
@tekton-robot
Collaborator

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale with a justification.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle stale

Send feedback to tektoncd/plumbing.

@tekton-robot added the lifecycle/stale label Feb 14, 2021
@tekton-robot
Collaborator

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten with a justification.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle rotten

Send feedback to tektoncd/plumbing.

@tekton-robot added the lifecycle/rotten label and removed the lifecycle/stale label Mar 16, 2021

5 participants