Guidance for writing Tasks: When is it appropriate to have multiple steps? #441

Closed
bobcatfish opened this issue Jul 23, 2020 · 3 comments
Labels
lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale.

Comments

@bobcatfish
Contributor

Expected Behavior

It'd be nice to provide users with best practices around when it is appropriate to use multiple steps and when it makes sense to combine into one.

Christie's dream ☁️

It'd be great if each step of a Task could do one thing well and one thing only. Each step could be a small image that only does one thing, and the image itself is well tested. A Task then becomes a sequence of highly cohesive tiny images each doing one thing.

Actual Behavior

In #408 @vdemeester and I disagreed about which of these approaches to take, split between these two options:

Two steps, one of which uses a shell to create yaml configuration:

  - name: create-heartbeat-pod-yaml
    image: bash
    script: |
      FULL_RESOURCE_OUTPUT=$(cat /workspace/full-resource-output.json)
      LEASED_RESOURCE=$(cat /tekton/results/leased-resource)
      cat <<EOF > /workspace/heartbeat.yaml
      apiVersion: v1
      kind: Pod
      metadata:
        name: boskos-heartbeat-$LEASED_RESOURCE
      spec:
        containers:
        - name: heartbeat
          image: gcr.io/k8s-staging-boskos/boskosctl@sha256:a7fc984732c5dd0b4e0fe0a92e2730fa4b6bddecd0f6f6c7c6b5501abe4ab105
          args:
          - heartbeat
          - --server-url=$(params.server-url)
          - --owner-name=$(params.owner-name)
          - --resource=$FULL_RESOURCE_OUTPUT
          - --period=5m
      EOF
  - name: start-boskosctl-heartbeat
    image: lachlanevenson/k8s-kubectl
    args:
    - "apply"
    - "-f"
    - "/workspace/heartbeat.yaml"

One step, which is able to both use a shell and execute kubectl because the image already contains a shell:

  - name: create-heartbeat-pod-yaml
    image: lachlanevenson/k8s-kubectl
    script: |
      FULL_RESOURCE_OUTPUT=$(cat /workspace/full-resource-output.json)
      LEASED_RESOURCE=$(cat /tekton/results/leased-resource)
      cat <<EOF | kubectl apply -f -
      apiVersion: v1
      kind: Pod
      metadata:
        name: boskos-heartbeat-$LEASED_RESOURCE
      spec:
        containers:
        - name: heartbeat
          image: gcr.io/k8s-staging-boskos/boskosctl@sha256:a7fc984732c5dd0b4e0fe0a92e2730fa4b6bddecd0f6f6c7c6b5501abe4ab105
          args:
          - heartbeat
          - --server-url=$(params.server-url)
          - --owner-name=$(params.owner-name)
          - --resource=$FULL_RESOURCE_OUTPUT
          - --period=5m
      EOF

This is a pretty minor example, but you can imagine this happening with more steps + more images.

Image reality

  • Many images are very large
  • Sometimes these images can't help but be large b/c the tools they contain themselves have many dependencies
  • Using scripts requires having a shell
  • Using multiple tools requires having all of those tools available in one image, for example this image which contains kustomize, kubectl, gcloud and ko

Shell scripts are great when they are short, but like any other code, they need maintenance and testing. The longer our scripts get, the harder they are to maintain and test.
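
One way to keep a script like the one above testable is to pull the YAML rendering into a function that only writes to stdout, so it can be checked without a cluster. This is a minimal sketch, not the Task from #408: the function name `render_heartbeat_pod` and its argument order are assumptions made for illustration, and the manifest simply mirrors the example earlier in this issue.

```shell
#!/bin/sh
# Hypothetical helper (name and argument order are illustrative assumptions).
# Renders the heartbeat Pod manifest to stdout and touches nothing else.
#   $1 leased resource name
#   $2 boskos server URL
#   $3 owner name
#   $4 full resource output, inserted verbatim as in the original example
render_heartbeat_pod() {
  leased_resource=$1
  server_url=$2
  owner_name=$3
  full_resource_output=$4
  # Unquoted EOF: the shell expands the variables inside the here-document.
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: boskos-heartbeat-${leased_resource}
spec:
  containers:
  - name: heartbeat
    image: gcr.io/k8s-staging-boskos/boskosctl@sha256:a7fc984732c5dd0b4e0fe0a92e2730fa4b6bddecd0f6f6c7c6b5501abe4ab105
    args:
    - heartbeat
    - --server-url=${server_url}
    - --owner-name=${owner_name}
    - --resource=${full_resource_output}
    - --period=5m
EOF
}
```

Inside a step, the function's output could then be piped to `kubectl apply -f -` or redirected to `/workspace/heartbeat.yaml`, so either of the two layouts above still works. The rendering logic is the only part that needs a test.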

Script mode in Tekton is a great feature, but if we rely on it too much, our catalog will be full of Tasks which wrap large images and shell scripts that run inside them. (maybe that's okay?)

Additional Info

tektoncd/pipeline#2925 is about making it possible to pipe the stdout of one step into another; if we had this kind of functionality, then the step orchestration almost becomes like a script itself

@chmouel
Member

chmouel commented Jul 24, 2020

I agree that a shell script can be opaque when it is large and embedded in a big script section of a YAML file, but I believe there are arguments in favour of shell scripts too. An image with a Go binary that only wraps shell commands is arguably as hard to read and test as a shell script, and it brings infrastructure complexity of its own.

Ideally we would have a binary that uses a 'native' library instead of spawning commands, for example a git-init binary that uses libgit instead of calling the git command. That would bring value by being more testable and maintainable than a large shell script.

On the topic of each step doing one thing and one thing only, I would love this as well, but k8s is not that well optimized for it: this can become quite a slow process with all the overhead, can't it?

@tekton-robot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.

/lifecycle stale

Send feedback to tektoncd/plumbing.

@tekton-robot tekton-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 22, 2020
@bobcatfish
Contributor Author

I think it's reasonable to close this for now; I don't have any actions to take (or clear ideas yet). We can reopen this if we want to discuss more in the future :)
