Guidance for writing Tasks: When is it appropriate to have multiple steps? #441

bobcatfish · 2020-07-23T21:19:22Z

Expected Behavior

It'd be nice to provide users with best practices around when it is appropriate to use multiple steps and when it makes sense to combine into one.

Christie's dream ☁️

It'd be great if each step of a Task could do one thing well and one thing only. Each step could be a small image that only does one thing, and the image itself is well tested. A Task then becomes a sequence of highly cohesive tiny images each doing one thing.

Actual Behavior

In #408 @vdemeester and I disagreed about which of these approaches to take, split between these two options:

Two steps, one of which uses a shell to create yaml configuration:

  - name: create-heartbeat-pod-yaml
    image: bash
    script: |
      FULL_RESOURCE_OUTPUT=$(cat /workspace/full-resource-output.json)
      LEASED_RESOURCE=$(cat /tekton/results/leased-resource)
      cat <<EOF > /workspace/heartbeat.yaml
      apiVersion: v1
      kind: Pod
      metadata:
        name: boskos-heartbeat-$LEASED_RESOURCE
      spec:
        containers:
        - name: heartbeat
          image: gcr.io/k8s-staging-boskos/boskosctl@sha256:a7fc984732c5dd0b4e0fe0a92e2730fa4b6bddecd0f6f6c7c6b5501abe4ab105
          args:
          - heartbeat
          - --server-url=$(params.server-url)
          - --owner-name=$(params.owner-name)
          - --resource=$FULL_RESOURCE_OUTPUT
          - --period=5m
      EOF
  - name: start-boskosctl-heartbeat
    image: lachlanevenson/k8s-kubectl
    args:
    - "apply"
    - "-f"
    - "/workspace/heartbeat.yaml"

One step, which is able to both use a shell and execute kubectl because the image already contains a shell:

  - name: create-heartbeat-pod-yaml
    image: lachlanevenson/k8s-kubectl
    script: |
      FULL_RESOURCE_OUTPUT=$(cat /workspace/full-resource-output.json)
      LEASED_RESOURCE=$(cat /tekton/results/leased-resource)
      cat <<EOF | kubectl apply -f -
      apiVersion: v1
      kind: Pod
      metadata:
        name: boskos-heartbeat-$LEASED_RESOURCE
      spec:
        containers:
        - name: heartbeat
          image: gcr.io/k8s-staging-boskos/boskosctl@sha256:a7fc984732c5dd0b4e0fe0a92e2730fa4b6bddecd0f6f6c7c6b5501abe4ab105
          args:
          - heartbeat
          - --server-url=$(params.server-url)
          - --owner-name=$(params.owner-name)
          - --resource=$FULL_RESOURCE_OUTPUT
          - --period=5m
      EOF

This is a pretty minor example, but you can imagine this happening with more steps + more images.

Image reality

Many images are very large.
Sometimes these images can't help but be large because the tools they contain have many dependencies of their own.
Using scripts requires having a shell in the image.
Using multiple tools requires having all of those tools available in one image, for example this image which contains kustomize, kubectl, gcloud and ko.

Shell scripts are great when they are short, but like any other code they need maintenance and testing. The longer our scripts get, the harder they are to maintain and test.
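One way to keep an inline script maintainable is to put the templating in a small function that takes its inputs as arguments, so it can be exercised without a cluster. This is only a minimal sketch: the helper name `render_heartbeat_pod` and the image reference `example.invalid/boskosctl:latest` are hypothetical placeholders, and the `--resource` argument from the examples above is left out for brevity.

```shell
#!/bin/sh
# Hypothetical helper (not part of any existing Task): renders a heartbeat
# Pod manifest to stdout instead of applying it directly.
# Arguments: $1 leased resource name, $2 server URL, $3 owner name.
render_heartbeat_pod() {
  leased_resource=$1
  server_url=$2
  owner_name=$3
  # The image below is a placeholder, not the real boskosctl image.
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: boskos-heartbeat-${leased_resource}
spec:
  containers:
  - name: heartbeat
    image: example.invalid/boskosctl:latest
    args:
    - heartbeat
    - --server-url=${server_url}
    - --owner-name=${owner_name}
    - --period=5m
EOF
}
```

Inside a step, the output could then be piped to `kubectl apply -f -`, while a test only needs to call the function and inspect what it prints.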

Script mode in Tekton is a great feature, but if we rely on it too much, our catalog will be full of Tasks which wrap large images and shell scripts that run inside them. (maybe that's okay?)

Additional Info

tektoncd/pipeline#2925 is about making it possible to pipe the stdout of one step into another. If we had this kind of functionality, the step orchestration would almost become a script in itself.
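To make the analogy concrete, here is a rough sketch of what "orchestration as a script" looks like in plain shell. The functions `produce` and `consume` are hypothetical stand-ins for what would otherwise be two separate steps; nothing here reflects how Tekton would actually implement step piping.

```shell
#!/bin/sh
# Hypothetical stand-in for a first step: writes its result to stdout.
produce() {
  printf '%s\n' "leased-resource-1"
}

# Hypothetical stand-in for a second step: reads the previous result on stdin.
consume() {
  read -r resource
  printf 'heartbeat for %s\n' "$resource"
}

# In a script, the "orchestration" between the two is just the pipe.
produce | consume
```

Each function stays small and single-purpose, which is the same property the one-thing-per-step approach is after.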

chmouel · 2020-07-24T08:25:47Z

I agree that shell scripts can be opaque when they are large and embedded in a big script section of a YAML file, but I believe there are arguments in favour of shell scripts too: an image with a Go binary that only wraps shell commands is arguably as hard to read and as hard to test as a shell script, and it brings infrastructure complexity of its own...

Ideally we would have binaries that use a 'native' library instead of spawning commands, for example a git-init binary that uses libgit instead of calling the git command. That would bring value by being more testable and maintainable than a large shell script.

On the topic of each step doing one thing and one thing only, I would love this as well, but Kubernetes is not that well optimized for it: this can become quite a slow process with all the overhead, can't it?

tekton-robot · 2020-10-22T08:52:06Z

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.

/lifecycle stale

Send feedback to tektoncd/plumbing.

bobcatfish · 2020-10-22T18:34:43Z

I think it's reasonable to close this for now: I don't have any actions to take (or clear ideas yet). We can reopen this if we want to discuss more in the future :)

bobcatfish mentioned this issue Jul 23, 2020

Add Tasks to acquire and release boskos resources 🐑 #408

Merged

2 tasks

vinamra28 mentioned this issue Aug 10, 2020

Add Task to install Tekton components using Released Tekton Operator #480

Merged

8 tasks

tekton-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 22, 2020

bobcatfish closed this as completed Oct 22, 2020
