Make phase condition reasons part of the API
TaskRuns and PipelineRuns use the "Reason" field to complement the
value of the "Succeeded" condition. Those values are not part of
the API and are even owned by the underlying resource (pod) in
case of TaskRuns. This makes it difficult to rely on them to
understand what the state of the resource is.

In case of corev1.ConditionTrue, the reason can be used to
distinguish between:
- Successful
- Successful, some parts were skipped (pipelinerun only)

In case of corev1.ConditionFalse, the reason can be used to
distinguish between:
- Failed
- Failed because of timeout
- Failed because it was cancelled by the user

In case of corev1.ConditionUnknown, the reason can be used to
distinguish between:
- Just started reconciling
- Validation done, running (or still running)
- Cancellation requested
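
The three lists above can be read as a lookup from a (status, reason) pair to a state. The following is a minimal sketch of that lookup, not code from this commit: the `conditionStatus` type and the `describe` function are hypothetical, and the reason strings are assumed to match the values of the `PipelineRunReason` constants added in `pipelinerun_types.go`.

```go
package main

import "fmt"

// conditionStatus is a hypothetical local type mirroring the three values
// of corev1.ConditionStatus, so the sketch has no external dependencies.
type conditionStatus string

const (
	statusTrue    conditionStatus = "True"
	statusFalse   conditionStatus = "False"
	statusUnknown conditionStatus = "Unknown"
)

// describe maps the status and reason of the "Succeeded" condition to a
// short description. The reason strings are assumed to be the PipelineRun
// values defined in this commit.
func describe(status conditionStatus, reason string) string {
	switch status {
	case statusTrue:
		if reason == "Completed" {
			return "successful, some tasks skipped"
		}
		return "successful"
	case statusFalse:
		switch reason {
		case "PipelineRunTimeout":
			return "failed: timed out"
		case "Cancelled":
			return "failed: cancelled by the user"
		default:
			return "failed"
		}
	case statusUnknown:
		switch reason {
		case "Started":
			return "just started reconciling"
		case "Running":
			return "running"
		case "Cancelled":
			return "cancellation requested"
		default:
			return "in progress"
		}
	}
	return "unrecognized status"
}

func main() {
	fmt.Println(describe(statusUnknown, "Started"))
	fmt.Println(describe(statusTrue, "Completed"))
	fmt.Println(describe(statusFalse, "PipelineRunTimeout"))
}
```

Note that the same reason can appear under two statuses: a cancellation reads as "requested" while the condition is Unknown and as "failed" once it is False.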

This is implemented through the following changes:
- Bubble-up reasons for taskrun and pipelinerun to the
  v1beta1 API, except for reasons that are defined by the
  underlying resource
- Enforce the start reason to be set during condition init

This allows for an additional change in the eventing module: the
condition before and after can be used to decide whether to send
an event at all. If they are different, the after condition now
contains enough information to send the event.
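
The before/after comparison described above could look roughly like the sketch below. It is an illustration under assumptions, not the eventing module's code: `condition` is a simplified stand-in for `apis.Condition`, and `shouldSendEvent` is a hypothetical helper name.

```go
package main

import "fmt"

// condition is a simplified, hypothetical stand-in for apis.Condition,
// keeping only the fields relevant to the comparison.
type condition struct {
	Status  string
	Reason  string
	Message string
}

// shouldSendEvent reports whether moving from the before condition to the
// after condition warrants an event. A nil before means the condition was
// not set previously; a nil after means there is nothing to report.
func shouldSendEvent(before, after *condition) bool {
	if after == nil {
		return false
	}
	if before == nil {
		return true
	}
	// All fields are comparable, so the structs can be compared directly.
	return *before != *after
}

func main() {
	started := &condition{Status: "Unknown", Reason: "Started"}
	running := &condition{Status: "Unknown", Reason: "Running"}

	fmt.Println(shouldSendEvent(nil, started))     // first condition ever set
	fmt.Println(shouldSendEvent(started, running)) // reason changed
	fmt.Println(shouldSendEvent(running, running)) // nothing changed
}
```

Because the reason now distinguishes states that share a status (for example Started and Running under Unknown), a change in reason alone is enough to trigger an event.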

The cloudevent module is extended with the ability to send the correct
event based on both status and reason.
afrittoli committed Jun 8, 2020
1 parent 514b240 commit f3f418d
Showing 21 changed files with 416 additions and 211 deletions.
66 changes: 66 additions & 0 deletions docs/pipelineruns.md
Expand Up @@ -18,6 +18,7 @@ weight: 4
- [Specifying `Workspaces`](#specifying-workspaces)
- [Specifying `LimitRange` values](#specifying-limitrange-values)
- [Configuring a failure timeout](#configuring-a-failure-timeout)
- [Monitoring execution status](#monitoring-execution-status)
- [Cancelling a `PipelineRun`](#cancelling-a-pipelinerun)
- [Events](events.md#pipelineruns)

Expand Down Expand Up @@ -362,6 +363,71 @@ The `timeout` value is a `duration` conforming to Go's
values are `1h30m`, `1h`, `1m`, and `60s`. If you set the global timeout to 0, all `PipelineRuns`
that do not have an individual timeout set will fail immediately upon encountering an error.

## Monitoring execution status

As your `PipelineRun` executes, its `status` field accumulates information on the execution of each `TaskRun`
as well as the `PipelineRun` as a whole. This information includes the name of the pipeline `Task` associated
with a `TaskRun`, the complete [status of the `TaskRun`](taskruns.md#monitoring-execution-status) and details
about `Conditions` that may be associated with a `TaskRun`.

The following example shows an extract from the `status` field of a `PipelineRun` that has executed successfully:

```yaml
completionTime: "2020-05-04T02:19:14Z"
conditions:
- lastTransitionTime: "2020-05-04T02:19:14Z"
  message: 'Tasks Completed: 4, Skipped: 0'
  reason: Succeeded
  status: "True"
  type: Succeeded
startTime: "2020-05-04T02:00:11Z"
taskRuns:
  triggers-release-nightly-frwmw-build-ng2qk:
    pipelineTaskName: build
    status:
      completionTime: "2020-05-04T02:10:49Z"
      conditions:
      - lastTransitionTime: "2020-05-04T02:10:49Z"
        message: All Steps have completed executing
        reason: Succeeded
        status: "True"
        type: Succeeded
      podName: triggers-release-nightly-frwmw-build-ng2qk-pod-8vj99
      resourcesResult:
      - key: commit
        resourceRef:
          name: git-source-triggers-frwmw
        value: 9ab5a1234166a89db352afa28f499d596ebb48db
      startTime: "2020-05-04T02:05:07Z"
      steps:
      - container: step-build
        imageID: docker-pullable://golang@sha256:a90f2671330831830e229c3554ce118009681ef88af659cd98bfafd13d5594f9
        name: build
        terminated:
          containerID: docker://6b6471f501f59dbb7849f5cdde200f4eeb64302b862a27af68821a7fb2c25860
          exitCode: 0
          finishedAt: "2020-05-04T02:10:45Z"
          reason: Completed
          startedAt: "2020-05-04T02:06:24Z"
```

The following table shows how to read the overall status of a `PipelineRun`:

`status`|`reason`|`completionTime` is set|Description
:-------|:-------|:---------------------:|--------------:
Unknown|Started|No|The `PipelineRun` has just been picked up by the controller.
Unknown|Running|No|The `PipelineRun` has been validated and has started to perform its work.
Unknown|PipelineRunCancelled|No|The user requested that the `PipelineRun` be cancelled. Cancellation has not completed yet.
True|Succeeded|Yes|The `PipelineRun` completed successfully.
True|Completed|Yes|The `PipelineRun` completed successfully, but one or more `Tasks` were skipped.
False|Failed|Yes|The `PipelineRun` failed because one of the `TaskRuns` failed.
False|\[Error message\]|No|The `PipelineRun` encountered a non-permanent error, but it's still running and it may ultimately succeed.
False|\[Error message\]|Yes|The `PipelineRun` failed with a permanent error (usually validation).
False|PipelineRunCancelled|Yes|The `PipelineRun` was cancelled successfully.
False|PipelineRunTimeout|Yes|The `PipelineRun` timed out.

When a `PipelineRun` changes status, [events](events.md#pipelineruns) are triggered accordingly.

## Cancelling a `PipelineRun`

To cancel a `PipelineRun` that's currently executing, update its definition
Expand Down
21 changes: 19 additions & 2 deletions docs/taskruns.md
Expand Up @@ -155,7 +155,7 @@ point for the `Pod` in which the container images specified in your `Task` will
customize the `Pod` configuration specifically for that `TaskRun`.

In the following example, the `Task` specifies a `volumeMount` (`my-cache`) object, also provided by the `TaskRun`,
using a `PersistentVolumeClaim` volume. A specific scheduler is also configured in the `SchedulerName` field.
using a `PersistentVolumeClaim` volume. A specific scheduler is also configured in the `SchedulerName` field.
The `Pod` executes with regular (non-root) user permissions.

```yaml
Expand Down Expand Up @@ -281,7 +281,7 @@ For more information, see [`ServiceAccount`](auth.md).
## Monitoring execution status

As your `TaskRun` executes, its `status` field accumulates information on the execution of each `Step`
as well as the `TaskRun` as a whole. This information includes start and stop times, exit codes, the
as well as the `TaskRun` as a whole. This information includes start and stop times, exit codes, the
fully-qualified name of the container image, and the corresponding digest.

**Note:** If any `Pods` have been [`OOMKilled`](https://kubernetes.io/docs/tasks/administer-cluster/out-of-resource/)
Expand Down Expand Up @@ -311,6 +311,23 @@ steps:
startedAt: "2019-08-12T18:22:54Z"
```

The following table shows how to read the overall status of a `TaskRun`:

`status`|`reason`|`completionTime` is set|Description
:-------|:-------|:---------------------:|--------------:
Unknown|Started|No|The TaskRun has just been picked up by the controller.
Unknown|Pending|No|The TaskRun is waiting on a Pod in status Pending.
Unknown|Running|No|The TaskRun has been validated and has started to perform its work.
Unknown|TaskRunCancelled|No|The user requested that the TaskRun be cancelled. Cancellation has not completed yet.
True|Succeeded|Yes|The TaskRun completed successfully.
False|Failed|Yes|The TaskRun failed because one of the steps failed.
False|\[Error message\]|No|The TaskRun encountered a non-permanent error, and it's still running. It may ultimately succeed.
False|\[Error message\]|Yes|The TaskRun failed with a permanent error (usually validation).
False|TaskRunCancelled|Yes|The TaskRun was cancelled successfully.
False|TaskRunTimeout|Yes|The TaskRun timed out.

When a `TaskRun` changes status, [events](events.md#taskruns) are triggered accordingly.

### Monitoring `Steps`

If multiple `Steps` are defined in the `Task` invoked by the `TaskRun`, you can monitor their execution
Expand Down
51 changes: 38 additions & 13 deletions pkg/apis/pipeline/v1beta1/pipelinerun_types.go
Expand Up @@ -73,18 +73,8 @@ func (pr *PipelineRun) GetTaskRunRef() corev1.ObjectReference {
}
}

// GetTypeMeta returns the task run type meta
func (pr *PipelineRun) GetTypeMeta() *metav1.TypeMeta {
return &pr.TypeMeta
}

// GetObjectMeta returns the task run type meta
func (pr *PipelineRun) GetObjectMeta() *metav1.ObjectMeta {
return &pr.ObjectMeta
}

// GetStatus returns the task run status as a RunsToCompletionStatus
func (pr *PipelineRun) GetStatus() RunsToCompletionStatus {
// GetStatusCondition returns the pipeline run status as a ConditionAccessor
func (pr *PipelineRun) GetStatusCondition() apis.ConditionAccessor {
return &pr.Status
}

Expand Down Expand Up @@ -221,6 +211,32 @@ type PipelineRunStatus struct {
PipelineRunStatusFields `json:",inline"`
}

// PipelineRunReason represents a reason for the pipeline run "Succeeded" condition
type PipelineRunReason string

const (
// PipelineRunReasonStarted is the reason set when the PipelineRun has just started
PipelineRunReasonStarted PipelineRunReason = "Started"
// PipelineRunReasonRunning is the reason set when the PipelineRun is running
PipelineRunReasonRunning PipelineRunReason = "Running"
// PipelineRunReasonSuccessful is the reason set when the PipelineRun completed successfully
PipelineRunReasonSuccessful PipelineRunReason = "Succeeded"
// PipelineRunReasonCompleted is the reason set when the PipelineRun completed successfully with one or more skipped Tasks
PipelineRunReasonCompleted PipelineRunReason = "Completed"
// PipelineRunReasonFailed is the reason set when the PipelineRun completed with a failure
PipelineRunReasonFailed PipelineRunReason = "Failed"
// PipelineRunReasonCancelled is the reason set when the PipelineRun is cancelled by the user
// This reason may be found with a corev1.ConditionFalse status, if the cancellation was processed successfully
// This reason may be found with a corev1.ConditionUnknown status, if the cancellation is being processed or failed
PipelineRunReasonCancelled PipelineRunReason = "Cancelled"
// PipelineRunReasonTimedOut is the reason set when the PipelineRun has timed out
PipelineRunReasonTimedOut PipelineRunReason = "PipelineRunTimeout"
)

func (t PipelineRunReason) String() string {
return string(t)
}

var pipelineRunCondSet = apis.NewBatchConditionSet()

// GetCondition returns the Condition matching the given type.
Expand All @@ -231,13 +247,22 @@ func (pr *PipelineRunStatus) GetCondition(t apis.ConditionType) *apis.Condition
// InitializeConditions will set all conditions in pipelineRunCondSet to unknown for the PipelineRun
// and set the started time to the current time
func (pr *PipelineRunStatus) InitializeConditions() {
started := false
if pr.TaskRuns == nil {
pr.TaskRuns = make(map[string]*PipelineRunTaskRunStatus)
}
if pr.StartTime.IsZero() {
pr.StartTime = &metav1.Time{Time: time.Now()}
started = true
}
conditionManager := pipelineRunCondSet.Manage(pr)
conditionManager.InitializeConditions()
// Ensure the started reason is set for the "Succeeded" condition
if started {
initialCondition := conditionManager.GetCondition(apis.ConditionSucceeded)
initialCondition.Reason = PipelineRunReasonStarted.String()
conditionManager.SetCondition(*initialCondition)
}
pipelineRunCondSet.Manage(pr).InitializeConditions()
}

// SetCondition sets the condition, unsetting previous conditions with the same
Expand Down
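
The change to `InitializeConditions` above sets the `Started` reason only on the call that also sets the start time, so later calls leave the reason alone. The sketch below reproduces just that behaviour with a hypothetical `status` type; it does not use the knative condition manager that the real code relies on.

```go
package main

import (
	"fmt"
	"time"
)

// status is a simplified, hypothetical stand-in for PipelineRunStatus and
// TaskRunStatus, holding a single reason instead of a condition set.
type status struct {
	StartTime *time.Time
	Reason    string
}

// initialize sets the start time and the "Started" reason on the first
// call only. Later calls leave an existing reason untouched.
func (s *status) initialize() {
	if s.StartTime == nil || s.StartTime.IsZero() {
		now := time.Now()
		s.StartTime = &now
		s.Reason = "Started"
	}
}

func main() {
	s := &status{}
	s.initialize()
	fmt.Println(s.Reason) // prints "Started"

	// Simulate the reconciler moving the run forward, then initializing again.
	s.Reason = "Running"
	s.initialize()
	fmt.Println(s.Reason) // prints "Running"
}
```

Tying the reason to the start time keeps a repeated initialization during reconciliation from resetting a run that has already progressed.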
19 changes: 18 additions & 1 deletion pkg/apis/pipeline/v1beta1/pipelinerun_types_test.go
Expand Up @@ -83,7 +83,7 @@ func TestPipelineRun_TaskRunref(t *testing.T) {
}
}

func TestInitializeConditions(t *testing.T) {
func TestInitializePipelineRunConditions(t *testing.T) {
p := &v1beta1.PipelineRun{
ObjectMeta: metav1.ObjectMeta{
Name: "test-name",
Expand All @@ -100,12 +100,29 @@ func TestInitializeConditions(t *testing.T) {
t.Fatalf("PipelineRun StartTime not initialized correctly")
}

condition := p.Status.GetCondition(apis.ConditionSucceeded)
if condition.Reason != v1beta1.PipelineRunReasonStarted.String() {
t.Fatalf("PipelineRun initialize reason should be %s, got %s instead", v1beta1.PipelineRunReasonStarted.String(), condition.Reason)
}
p.Status.TaskRuns["fooTask"] = &v1beta1.PipelineRunTaskRunStatus{}

// Change the reason before we initialize again
p.Status.SetCondition(&apis.Condition{
Type: apis.ConditionSucceeded,
Status: corev1.ConditionUnknown,
Reason: "not just started",
Message: "hello",
})

p.Status.InitializeConditions()
if len(p.Status.TaskRuns) != 1 {
t.Fatalf("PipelineRun status getting reset")
}

newCondition := p.Status.GetCondition(apis.ConditionSucceeded)
if newCondition.Reason != "not just started" {
t.Fatalf("PipelineRun initialize reset the condition reason to %s", newCondition.Reason)
}
}

func TestPipelineRunIsDone(t *testing.T) {
Expand Down
70 changes: 51 additions & 19 deletions pkg/apis/pipeline/v1beta1/taskrun_types.go
Expand Up @@ -72,10 +72,6 @@ const (
// TaskRunSpecStatusCancelled indicates that the user wants to cancel the task,
// if not already cancelled or terminated
TaskRunSpecStatusCancelled = "TaskRunCancelled"

// TaskRunReasonCancelled indicates that the TaskRun has been cancelled
// because it was requested so by the user
TaskRunReasonCancelled = "TaskRunCancelled"
)

// TaskRunInputs holds the input values that this task was invoked with.
Expand All @@ -102,6 +98,43 @@ type TaskRunStatus struct {
TaskRunStatusFields `json:",inline"`
}

// TaskRunReason is an enum used to store all TaskRun reasons for
// the Succeeded condition that are controlled by the TaskRun itself. Failure
// reasons that emerge from underlying resources are not included here
type TaskRunReason string

const (
// TaskRunReasonStarted is the reason set when the TaskRun has just started
TaskRunReasonStarted TaskRunReason = "Started"
// TaskRunReasonRunning is the reason set when the TaskRun is running
TaskRunReasonRunning TaskRunReason = "Running"
// TaskRunReasonSuccessful is the reason set when the TaskRun completed successfully
TaskRunReasonSuccessful TaskRunReason = "Succeeded"
// TaskRunReasonFailed is the reason set when the TaskRun completed with a failure
TaskRunReasonFailed TaskRunReason = "Failed"
// TaskRunReasonCancelled is the reason set when the TaskRun is cancelled by the user
TaskRunReasonCancelled TaskRunReason = "TaskRunCancelled"
// TaskRunReasonTimedOut is the reason set when the TaskRun has timed out
TaskRunReasonTimedOut TaskRunReason = "TaskRunTimeout"
)

func (t TaskRunReason) String() string {
return string(t)
}

// GetStartedReason returns the reason set to the "Succeeded" condition when
// InitializeConditions is invoked
func (trs *TaskRunStatus) GetStartedReason() string {
return TaskRunReasonStarted.String()
}

// GetRunningReason returns the reason set to the "Succeeded" condition when
// the RunsToCompletion starts running. This is used to indicate that the resource
// could be validated and is starting to perform its job.
func (trs *TaskRunStatus) GetRunningReason() string {
return TaskRunReasonRunning.String()
}

// MarkResourceNotConvertible adds a Warning-severity condition to the resource noting
// that it cannot be converted to a higher version.
func (trs *TaskRunStatus) MarkResourceNotConvertible(err *CannotConvertError) {
Expand All @@ -116,11 +149,11 @@ func (trs *TaskRunStatus) MarkResourceNotConvertible(err *CannotConvertError) {

// MarkResourceFailed sets the ConditionSucceeded condition to ConditionFalse
// based on an error that occurred and a reason
func (trs *TaskRunStatus) MarkResourceFailed(reason string, err error) {
func (trs *TaskRunStatus) MarkResourceFailed(reason TaskRunReason, err error) {
taskRunCondSet.Manage(trs).SetCondition(apis.Condition{
Type: apis.ConditionSucceeded,
Status: corev1.ConditionFalse,
Reason: reason,
Reason: reason.String(),
Message: err.Error(),
})
}
Expand Down Expand Up @@ -180,23 +213,13 @@ type TaskRunResult struct {
Value string `json:"value"`
}

// GetTypeMeta returns the task run type meta
func (tr *TaskRun) GetTypeMeta() *metav1.TypeMeta {
return &tr.TypeMeta
}

// GetObjectMeta returns the task run type meta
func (tr *TaskRun) GetObjectMeta() *metav1.ObjectMeta {
return &tr.ObjectMeta
}

// GetOwnerReference gets the task run as owner reference for any related objects
func (tr *TaskRun) GetOwnerReference() metav1.OwnerReference {
return *metav1.NewControllerRef(tr, taskRunGroupVersionKind)
}

// GetStatus returns the task run status as a RunsToCompletionStatus
func (tr *TaskRun) GetStatus() RunsToCompletionStatus {
// GetStatusCondition returns the task run status as a ConditionAccessor
func (tr *TaskRun) GetStatusCondition() apis.ConditionAccessor {
return &tr.Status
}

Expand All @@ -208,10 +231,19 @@ func (trs *TaskRunStatus) GetCondition(t apis.ConditionType) *apis.Condition {
// InitializeConditions will set all conditions in taskRunCondSet to unknown for the TaskRun
// and set the started time to the current time
func (trs *TaskRunStatus) InitializeConditions() {
started := false
if trs.StartTime.IsZero() {
trs.StartTime = &metav1.Time{Time: time.Now()}
started = true
}
conditionManager := taskRunCondSet.Manage(trs)
conditionManager.InitializeConditions()
// Ensure the started reason is set for the "Succeeded" condition
if started {
initialCondition := conditionManager.GetCondition(apis.ConditionSucceeded)
initialCondition.Reason = TaskRunReasonStarted.String()
conditionManager.SetCondition(*initialCondition)
}
taskRunCondSet.Manage(trs).InitializeConditions()
}

// SetCondition sets the condition, unsetting previous conditions with the same
Expand Down
34 changes: 34 additions & 0 deletions pkg/apis/pipeline/v1beta1/taskrun_types_test.go
Expand Up @@ -340,3 +340,37 @@ func TestHasTimedOut(t *testing.T) {
})
}
}

func TestInitializeTaskRunConditions(t *testing.T) {
tr := &v1beta1.TaskRun{
ObjectMeta: metav1.ObjectMeta{
Name: "test-name",
Namespace: "test-ns",
},
}
tr.Status.InitializeConditions()

if tr.Status.StartTime.IsZero() {
t.Fatalf("TaskRun StartTime not initialized correctly")
}

condition := tr.Status.GetCondition(apis.ConditionSucceeded)
if condition.Reason != v1beta1.TaskRunReasonStarted.String() {
t.Fatalf("TaskRun initialize reason should be %s, got %s instead", v1beta1.TaskRunReasonStarted.String(), condition.Reason)
}

// Change the reason before we initialize again
tr.Status.SetCondition(&apis.Condition{
Type: apis.ConditionSucceeded,
Status: corev1.ConditionUnknown,
Reason: "not just started",
Message: "hello",
})

tr.Status.InitializeConditions()

newCondition := tr.Status.GetCondition(apis.ConditionSucceeded)
if newCondition.Reason != "not just started" {
t.Fatalf("TaskRun initialize reset the condition reason to %s", newCondition.Reason)
}
}