Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TEP-0086: Changing the way result parameters are stored #521

Merged
merged 6 commits into from
May 9, 2022

Conversation

tlawrie
Copy link
Contributor

@tlawrie tlawrie commented Sep 27, 2021

Initial TEP PR for issue tektoncd/pipeline#4012

@tekton-robot tekton-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 27, 2021
@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Sep 27, 2021

CLA Signed

The committers are authorized under a signed CLA.

@tekton-robot tekton-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Sep 27, 2021
@vdemeester vdemeester added the kind/tep Categorizes issue or PR as related to a TEP (or needs a TEP). label Sep 27, 2021
@skaegi
Copy link
Contributor

skaegi commented Sep 27, 2021

/assign @afrittoli

@skaegi
Copy link
Contributor

skaegi commented Sep 27, 2021

/assign @vdemeester

@bobcatfish
Copy link
Contributor

/assign

@tlawrie tlawrie marked this pull request as ready for review September 29, 2021 02:07
@tekton-robot tekton-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 29, 2021
@tlawrie
Copy link
Contributor Author

tlawrie commented Sep 29, 2021

Submitting PR based on Alternate API WG discussion with 0086 in 'proposed' status

Copy link
Member

@afrittoli afrittoli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for starting this!

I think having a flexible mechanism for to share results between tasks and pipelines is an important feature, and being able to store larger results would be super-nice.

Even if binary artefacts are mentioned as a non-goal, I think this goes in the direction of artefacts anyways, and I wonder if should provide a way for users to plug-in their preferred storage mechanism, whether it's etcd (ConfigMap, CRD, status or so), a database or some kind of artefact storage.

teps/0086-changing-the-way-result-parameters-are-stored.md Outdated Show resolved Hide resolved
teps/0086-changing-the-way-result-parameters-are-stored.md Outdated Show resolved Hide resolved
teps/0086-changing-the-way-result-parameters-are-stored.md Outdated Show resolved Hide resolved
teps/0086-changing-the-way-result-parameters-are-stored.md Outdated Show resolved Hide resolved
teps/0086-changing-the-way-result-parameters-are-stored.md Outdated Show resolved Hide resolved

1. N Configmaps: Per TaskRun or Per pipeline with Patch Merges (concern on parallelism even with queued Patch Merges)

- Suppose hypothetically that the pipelines controller were to create N ConfigMaps** for each of N results, it could also grant the workload access to write to these results using one of these focused Roles:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case of a pipeline, it would be one ConfigMap for each result defined in each of the Tasks, plus one for each result defined at pipeline level? What is the reason for having so many different config maps?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had assumed it would be one ConfigMap for all Results for a specific Task. The idea behind this was security i.e. a write once at end of TaskRun only by that approved Task.

But I am sure the others from the issue can also add additional detail here.

teps/0086-changing-the-way-result-parameters-are-stored.md Outdated Show resolved Hide resolved
- Concern around having to create a new ConfigMap**, Role and RoleBinding per TaskRun, when at the end of the day we don't actually care about updating that ConfigMap**, but the TaskRun's results.
- This option won't 'scale fail' which would happen with self definition update when the definition itself starts to get too big in unrelated areas (aggregate limit).

2. CRD
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean creating a dedicated CRD to store results?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes


There is the current alternative of storing result parameters as data in a workspace, however Workspaces
- require there has to be a storage mechanism in the cluster that can be shared between Tasks. That can be complex, or have performance issues in itself if using an As A Service that orders the storage at spin-up time. Or forces Tasks to all run on the same node. etc. Storage is a big complex adoption hurdle.
- changes the way end users refer to the result parameter or pass between containers
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I 100% we should not change the interface for retrieving results.
For producing results, if we decide to provide a new interface, we should maintain backward compatibility anyways.

Since tasks can be written in any programming language, it feels to me that the current file-system based interface works well across languages, and it can be used without the need of extra dependencies.

I wonder if it would make sense to provide a pluggable interface on Tekton side?
We could keep both read and write interfaces from Tasks as they are, provide a default storage implementation as Tekton and let users plug-in other kind of storage based on their requirements (size, speed).

- Could run as part of the controller
- Could be a separate HTTP server(s) which write to TaskRuns (or even config maps); task pods connect to this server to submit results, this server records the results (means result size would be limited to what can be stored in the CRD but there probably needs to be an upper bound on result size anyway)

4. Self-update / mutate the TaskRun via admission controller (with the various controls i.e. first write, subsequent read only)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

im not quite sure how this would work - how would the admission controller know what result to write?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good question? i just grabbed it from the issue. Should I change or remove it?

@tekton-robot tekton-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Oct 17, 2021
@tlawrie
Copy link
Contributor Author

tlawrie commented Oct 18, 2021

Updated based on a number of the review comments.

@afrittoli
Copy link
Member

@tlawrie Something went wrong in the PR rebase perhaps? It now contains 27 commits, most unrelated to this TEP.

Copy link
Member

@afrittoli afrittoli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/hold

@tekton-robot tekton-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 18, 2021
Copy link
Member

@afrittoli afrittoli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Marked it as "on-hold" until the commits are sorted out

I noticed this TEP is not yet accepted and merged. This TEP is stuck in the
review process. Dropping the proposal and design details based on the
recent discussion in the API WG. This TEP instead now contains a list of
alternatives. If we can merge this without proposal, it opens up an opportunity
for the community to create a PoC and evaluate which solution fits best for
the set of the requirements.

Moving the proposal and design details under the first alternative, making it
more like a TEP within TEP.

One major change here is the requirement section in which task results as part
of the taskRun status is listed as a requirement with the flexible design for
the use cases which would like to avoid it.

Updated the toc.
Copy link
Member

@jerop jerop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tlawrie @pritidesai @imjasonh - thank you for the hard work here!

Given that the possible solutions are all listed as alternatives and we have contributors interested in experimenting to help us figure out a way forward, I suggest that we approve this TEP in its current proposed state to indicate this is a direction we want to go and unblock the contributors to work on the experiments. What do you think @tektoncd/core-maintainers?

@afrittoli
Copy link
Member

/test .*

@pritidesai
Copy link
Member

We have approval from @vdemeester @bobcatfish and @jerop.

@ScrapCodes has a PoC created in the pipeline repo in the form of a WIP PR: tektoncd/pipeline#4838

Please help merge this PR so that we can move forward with the implementation.

@afrittoli
Copy link
Member

This was on hold until commits were sorted out which is the case now :)
/hold cancel

@tekton-robot tekton-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 5, 2022
@afrittoli
Copy link
Member

/test pull-community-teps-lint

@afrittoli
Copy link
Member

/test pull-community-org-validation

Copy link
Member

@vdemeester vdemeester left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can refer to tektoncd/pipeline#4808 and I would also like to see the idea of a "built-in" CSI volume driver as a possible alternative to explore.

But that can be done in follow-ups PR on this. We need to fix results in some way.
/approve

@tekton-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bobcatfish, jerop, vdemeester

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@dibyom
Copy link
Member

dibyom commented May 6, 2022

/lgtm

@tekton-robot tekton-robot added the lgtm Indicates that a PR is ready to be merged. label May 6, 2022
@jerop
Copy link
Member

jerop commented May 6, 2022

/test pull-community-org-validation

Co-authored-by: Jerop Kipruto <jerop@google.com>
@tekton-robot tekton-robot removed the lgtm Indicates that a PR is ready to be merged. label May 9, 2022
@tlawrie
Copy link
Contributor Author

tlawrie commented May 9, 2022

@tlawrie @pritidesai @imjasonh - thank you for the hard work here!

Given that the possible solutions are all listed as alternatives and we have contributors interested in experimenting to help us figure out a way forward, I suggest that we approve this TEP in its current proposed state to indicate this is a direction we want to go and unblock the contributors to work on the experiments. What do you think @tektoncd/core-maintainers?

@jerop I agree. Merging it in proposed lgtm. Overall, I have a vision that hopefully will be achieved with a pluggable interface with results stored byRef on the TaskRun but I know people are eager to start experimenting!

@tlawrie
Copy link
Contributor Author

tlawrie commented May 9, 2022

Thanks to everyone for all the feedback & conversation & improvements that this generated and how it evolved.

This is the first TEP I have ever worked on and I wanted to give big kudos to @bobcatfish (and now @jerop @imjasonh @mattmoor for the time they have put into guiding and designing with me.

I can't believe its been 11 months of discussion and work and growth to get to this point. I first raised an issue w.r.t. this on 8 June 2021!

@pritidesai
Copy link
Member

API WG - two pending comments to address in the follow up PR, ready to merge in proposed state.

/lgtm

@tekton-robot tekton-robot added the lgtm Indicates that a PR is ready to be merged. label May 9, 2022
@tekton-robot tekton-robot merged commit 66414b5 into tektoncd:main May 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/s3c Issues or PRs that are related to Secure Software Supply Chain (S3C) kind/tep Categorizes issue or PR as related to a TEP (or needs a TEP). lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
Status: NEW
Status: UnAssigned
Status: Implementable
Development

Successfully merging this pull request may close these issues.