allow users to add workflow states after a built-in executing action #3014

reubenmiller · 2024-07-24T13:49:06Z

Is your feature improvement request related to a problem? Please describe.

When overriding the config_update workflow which re-uses the existing thin-edge.io config_update functionality and only adds a pre and post state. The postprocess state is not currently not possible to add as the built-in executing state transitions automatically to successful ignoring whatever on_success is defined.

An example workflow is show below, where preprocess and postprocess have been added to perform some custom actions before letting the tedge-agent execute the built-in functionality.

#:schema https://gist.githubusercontent.com/reubenmiller/4e28e8403fe0c54b7461ac7d1d6838c2/raw/4ad7e3bc3ce2e3a7542a5b4873a0196cbe91f35d/tedge.workflow.json

operation = "config_update"
on_error = "failed"

[init]
  action = "proceed"
  on_success = "preprocess"

[preprocess]
  script = "/usr/bin/config_update.sh preprocess ${.payload.type}"
  on_success = "scheduled"
  # Proceed on any unexpected errors to ensure the operations status changed are handled
  on_error = "scheduled"

[scheduled]
  action = "builtin"
  on_success = "executing"

[executing]
  action = "builtin"
  on_success = "postprocess"

[postprocess]
  script = "/usr/bin/config_update.sh postprocess ${.payload.type}"
  on_success = "successful"

[successful]
 action = "cleanup"

[failed]
  action = "cleanup"

Below shows the workflow log after an execution, notice how the postprocess state is not present in the log (but the preprocess state is).

==================================================================
Triggered config_update workflow
==================================================================

topic:     te/device/main///cmd/config_update/c8y-mapper-38494494
operation: config_update
cmd_id:    c8y-mapper-38494494
time:      2024-07-24T13:43:33.22652579Z

==================================================================

----------------------[ config_update @ init | time=2024-07-24T13:43:33.228707959Z ]----------------------

State:    {"logPath":"/var/log/tedge/agent/workflow-config_update-c8y-mapper-38494494.log","remoteUrl":"http://127.0.0.1:8001/c8y/inventory/binaries/38462323","status":"init","type":"custom1"}

Action:   move to preprocess state

=> moving to config_update @ preprocess

----------------------[ config_update @ preprocess | time=2024-07-24T13:43:33.243623851Z ]----------------------

State:    {"logPath":"/var/log/tedge/agent/workflow-config_update-c8y-mapper-38494494.log","remoteUrl":"http://127.0.0.1:8001/c8y/inventory/binaries/38462323","status":"preprocess","type":"custom1"}

Action:   /usr/bin/config_update.sh preprocess custom1

Exit status: 0 (OK)

stderr <<EOF
Backing up file: 
EOF

stdout (EMPTY)
=> moving to config_update @ scheduled

----------------------[ config_update @ scheduled | time=2024-07-24T13:43:33.255045364Z ]----------------------

State:    {"logPath":"/var/log/tedge/agent/workflow-config_update-c8y-mapper-38494494.log","remoteUrl":"http://127.0.0.1:8001/c8y/inventory/binaries/38462323","status":"scheduled","type":"custom1"}

Action:   builtin


----------------------[ config_update @ executing | time=2024-07-24T13:43:33.265693542Z ]----------------------

State:    {"logPath":"/var/log/tedge/agent/workflow-config_update-c8y-mapper-38494494.log","remoteUrl":"http://127.0.0.1:8001/c8y/inventory/binaries/38462323","status":"executing","tedgeUrl":"http://127.0.0.1:8001/c8y/inventory/binaries/38462323","type":"custom1"}

Action:   builtin


----------------------[ config_update @ successful | time=2024-07-24T13:43:34.163506312Z ]----------------------

State:    {"logPath":"/var/log/tedge/agent/workflow-config_update-c8y-mapper-38494494.log","path":"/etc/custom1.toml","remoteUrl":"http://127.0.0.1:8001/c8y/inventory/binaries/38462323","status":"successful","tedgeUrl":"http://127.0.0.1:8001/c8y/inventory/binaries/38462323","type":"custom1"}

Action:   wait for the requester to finalize the command

Describe the solution you'd like

Built-in actions should honour the on_success fields defined in a given workflow.

Describe alternatives you've considered

Additional context

The text was updated successfully, but these errors were encountered:

didier-wenzek · 2024-08-19T16:14:42Z

The original proposal is to have the built-in actions honoring the on_success and on_error fields:

[scheduled]
action = "builtin"
on_success = "executing"

[executing]
action = "builtin"
on_success = "postprocess"
on_error = "rollback"

However, to be more consistent with the other actions - notably background scripts,
it would be better to distinguish the two scheduled and executing cases as for sub-operations actions.
When scheduled, the operation is only triggered and the final outcome is awaited in the executing state.
Hence, I propose that the built-in actions honor not only the on_success and on_error fields, but also the on_exec field.

[scheduled]
  action = "builtin"
  on_exec = "executing"

[executing]
  action = "builtin"
  on_success = "postprocess"
  on_error = "rollback"

I also propose to deprecate the confusing builtin keyword.
And, instead of implicitly triggering the action named after the operation,
let the user be explicit about the action:

[scheduled]
  action = "software_update"
  on_exec = "executing"

[executing]
  action = "await_software_update"
  on_success = "postprocess"
  on_error = "rollback"

Then, as a syntactic sugar, it will be convenient to squash the trigger and the await actions. As in:

[scheduled]
  action = "software_update"
  on_exec = "executing"
  on_success = "postprocess"
  on_error = "rollback"

didier-wenzek · 2024-08-20T15:10:38Z

Design

Three points have to be addressed:

Determining the builtin command and state
- Currently, using action = builtin, these are implied.
  The command is given by the operation of the workflow
  and the status is the current status of the command.
- With action = <operation-name> the operation is explicit
  and the status is derived from the workflow definition.
  - action = <operation-name> -> scheduled
  - action = <await-operation-name> -> executing
Forwarding the command to the appropriate actor.
- This is done by the CommandDispatcher::send() method which uses the command name as the dispatch key.
- With the current setting (i.e. action = builtin), the status can be sent unchanged;
  but with explicit operation-name (as in action = software_update or action = await_software_update)
  the status as to be set appropriately (either to scheduled or executing).
  Indeed, the user might use such an action in a custom state.
Determining the next state using the status returned by the actor to proceed with the appropriate exit handler
- status == executing -> on_exec handler
- status == successful -> on_success handler
- status == failed -> on_error handler

didier-wenzek · 2024-08-29T13:38:30Z

After discussion with @reubenmiller , we agreed on another more consistent approach to deprecate the confusing builtin action.

The key idea is to see the issue as a namespace issue:
we would like to distinguish user-defined workflows from built-in ones,
and, doing so, to allow users to invoke builtin workflows from their own workflow.

So the proposal is to distinguish builtin:operation-xyz> from operation-xyz
the later being a customized version of the former
(or being the same if no customized version has been provided by the user).

Triggering the builtin behavior will then be done using the operation keyword as for triggering a sub operation.

[scheduled]
  operation = "builtin:software_update"
  on_exec = "executing"

[executing]
  action = "await-operation-completion"
  on_success = "postprocess"
  on_error = "rollback"

Triggering the builtin version of an operation is done following the same rules as for sub operations:

In the workflow definition, the user can:
- set the input of the builtin operation, possibly using an input_script,
  the default being to pass the current state unchanged to the builtin workflow.
- inject the output of builtin operation into the current command state,
  the default being to merge the builtin outcome to the current command state.
- specify the next steps on_exec, on_success and on_error.
During execution:
- All the steps of the builtin workflow are logged in the main operation log file

For backward compatibility, any action = "builtin" definition
will be translated into the equivalent operation = "builtin:<operation-name>",
using the same current implicit resolution rules and raising an error if they don't apply.

albinsuresh · 2024-09-05T18:13:07Z

The last proposal definitely covers the stated requirement of adding post-processing states after a built-in operation execution. But we should also consider a future enhancement where users would want to override the builtin:software_update or builtin:config_update operations into more granular actions. For example, in the case of a config_update operation, the execution involves the following steps:

Send the executing status update
Download the config file
Apply the updated config file

In most cases, the users would want to reuse all those steps but sometimes add an additional state in between step 2 and step 3 to do some validation on the downloaded artefact before it is applied. In such cases, instead of a all-in-one builtin:config_update action, they'd want to break the workflow in granular states with granular actions like builtin:download, builtin:file-copy etc so that they can add additional logic between these actions.

The current proposal doesn't prevent the possibility to add more actions like that in the future, which is why I like this over the previous one tying actions to their corresponding states. builtin:config_update is just a named action that does everything for now, but we are free to add builtin:download and builtin:file-copy in future.

My only concern with this proposal is the decision to overload the same operation keyword for the same, as that might be a bit confusing to the user to differentiate between operation = "config_update" vs operation = "builtin:config_update" when they define their own composite workflows. Since the externally visible behaviour is also different for both (e.g: visibility over MQTT) that just adds to the confusion. So, I'd be in favour of a new keyword with input parameters similar to operation.

didier-wenzek · 2024-09-06T08:58:51Z

My only concern with this proposal is the decision to overload the same operation keyword for the same, as that might be a bit confusing to the user to differentiate between operation = "config_update" vs operation = "builtin:config_update" when they define their own composite workflows. Since the externally visible behaviour is also different for both (e.g: visibility over MQTT) that just adds to the confusion. So, I'd be in favour of a new keyword with input parameters similar to operation.

I see a difference between steps like download and file-copy compare to builtin:config_update.

It makes sense to trigger a download from any workflow.
Technically, a builtin:config_update could be triggered from any workflow, say device profile. But allowing this would be error prone: any customization for configuration updates will be ignored and this is not what the user might expect.

gligorisaev · 2024-09-13T05:39:09Z

Review of the Test Suite and Documentation done, tested also for flakiness, working as expected.

reubenmiller added improvement User value theme:workflows labels Jul 24, 2024

didier-wenzek self-assigned this Aug 12, 2024

didier-wenzek mentioned this issue Aug 20, 2024

feat: workflow builtin actions honoring on_success handlers #3070

Closed

19 tasks

didier-wenzek mentioned this issue Sep 4, 2024

Feat: improve workflow builtin actions #3105

Merged

20 tasks

didier-wenzek assigned gligorisaev and unassigned didier-wenzek Sep 12, 2024

reubenmiller added this to the 1.3.0 milestone Sep 13, 2024

reubenmiller closed this as completed Sep 13, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

allow users to add workflow states after a built-in executing action #3014

allow users to add workflow states after a built-in executing action #3014

reubenmiller commented Jul 24, 2024

didier-wenzek commented Aug 19, 2024

didier-wenzek commented Aug 20, 2024

didier-wenzek commented Aug 29, 2024 •

edited

Loading

albinsuresh commented Sep 5, 2024

didier-wenzek commented Sep 6, 2024

gligorisaev commented Sep 13, 2024

allow users to add workflow states after a built-in executing action #3014

allow users to add workflow states after a built-in executing action #3014

Comments

reubenmiller commented Jul 24, 2024

didier-wenzek commented Aug 19, 2024

didier-wenzek commented Aug 20, 2024

Design

didier-wenzek commented Aug 29, 2024 • edited Loading

albinsuresh commented Sep 5, 2024

didier-wenzek commented Sep 6, 2024

gligorisaev commented Sep 13, 2024

didier-wenzek commented Aug 29, 2024 •

edited

Loading