[🐛 CI]: Jobs are not getting cancelled properly #13483

titusfortner · 2024-01-22T18:20:14Z

What happened?

Putting this in an issue since there is a lot of info and I need help. 😄

The original CI code was:

name: CI

on:
  pull_request:
  push:
    branches:
      - trunk
  schedule:
    - cron: "0 */12 * * *"
  workflow_dispatch:

  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: ${{ github.event_name == 'pull_request' }}

For some reason, this job that was manually triggered (with "workflow_dispatch") got the message: Canceling since a higher priority waiting request for 'CI-refs/heads/trunk' exists based on this job which was triggered by a push.

Since I wanted to run everything on trunk at that time, and the new job only ran JS tests based on our Bazel check job, I tried to redo the concurrency section.

The idea was to differentiate the groups between something that tested everything ("workflow_dispatch" and "schedule") from the ones that do not necessarily do that ("pull_request" and "push") by appending the string "-all" for the first two. Doing this required using a "fake ternary". The current code is:

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}${{ github.event_name == 'workflow_dispatch' && '-all' || '' }}${{ github.event_name == 'schedule' && '-all' || '' }}
  cancel-in-progress: true

except now I'm running into the issue where this job that tests just the Python tests gets canceled because of this job that tests just the Ruby tests.

Looking at this again, I realize that the cancel-in-progress: ${{ github.event_name == 'pull_request' }} should have prevented the original issue I had, so now I'm a little stuck on what this code should be.

The goal:

Cancel jobs that re-run tests that have changed in the same branch with a subsequent job
Do not cancel jobs that are running tests that might not get run in a subsequent job

If we can trust bazel to cache properly, then maybe we don't need to worry about running more jobs and we shouldn't cancel things in progress? Or is caching not good enough? (there have been periods of active development where the CI was backed up about 4 hours which led me to believe we were redoing a lot of tests)

@p0deje / @diemol Any ideas?

The text was updated successfully, but these errors were encountered:

See #13483

p0deje · 2024-01-22T21:33:47Z

It should be fairly straightforward if we use the following group:

group: ${{ github.workflow }}-${{ github.ref }}-${{ github.actor }}-${{ github.event_name }}

Now if you run a job, it should not affect the builds that are started manually by other people (github.action is responsible for that) and different events (github.event_name).

titusfortner · 2024-01-22T22:00:07Z

Except if I push a Ruby update to trunk then push a JS update to trunk, the Ruby update will get stopped

And I'd be ok with a test run that tests everything stopping something else testing everything (hence combining dispatch & scheduled). But I guess I'd rather run more tests than fewer tests if it comes down to it.

p0deje · 2024-01-22T22:08:53Z

Except if I push a Ruby update to trunk then push a JS update to trunk, the Ruby update will get stopped

That should be handled by cancel-in-progress so in this case the Ruby update will have to finish first.

titusfortner · 2024-01-23T03:49:37Z

If it matches groups and cancel-in-progress is false, it does one at a time. If cancel-in-progress is true, it will stop the previous running one and start the new one. At least that's how I understand it, which is why I no longer understand why it wasn't working the first time.

diemol · 2024-01-23T09:18:48Z

Except if I push a Ruby update to trunk then push a JS update to trunk, the Ruby update will get stopped

That should be handled by cancel-in-progress so in this case the Ruby update will have to finish first.

This.

So, we should disable it, and run a build at a time. Which should be quick enough given bazel caching.

p0deje · 2024-01-23T16:35:19Z

At least that's how I understand it, which is why I no longer understand why it wasn't working the first time.

That's right and I am curious why it didn't work too.

titusfortner · 2024-01-23T17:55:52Z

I just removed everything until we know what we want — cc85374

I deleted everything because why should we only run one build at a time? Let's run whatever GitHub lets us run.

titusfortner · 2024-01-23T17:56:44Z

Or does that get in the way of how we are doing caching?

If 2 jobs need to use the same cache... I'm not sure how bazel + github manages that.

titusfortner · 2024-05-31T02:29:24Z

I'm closing this, we'll know if we want to make changes later, don't need this to track it

github-actions · 2024-06-30T22:48:41Z

This issue has been automatically locked since there has not been any recent activity since it was closed. Please open a new issue for related bugs.

titusfortner added I-defect C-build Build related issues (bazel and CI) labels Jan 22, 2024

titusfortner added a commit that referenced this issue Jan 22, 2024

[ci] deleting concurrency section until we figure out what we need

cc85374

See #13483

titusfortner closed this as completed May 31, 2024

github-actions bot locked and limited conversation to collaborators Jun 30, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[🐛 CI]: Jobs are not getting cancelled properly #13483

[🐛 CI]: Jobs are not getting cancelled properly #13483

titusfortner commented Jan 22, 2024 •

edited

Loading

p0deje commented Jan 22, 2024

titusfortner commented Jan 22, 2024

p0deje commented Jan 22, 2024 •

edited

Loading

titusfortner commented Jan 23, 2024

diemol commented Jan 23, 2024

p0deje commented Jan 23, 2024

titusfortner commented Jan 23, 2024

titusfortner commented Jan 23, 2024

titusfortner commented May 31, 2024

github-actions bot commented Jun 30, 2024

[🐛 CI]: Jobs are not getting cancelled properly #13483

[🐛 CI]: Jobs are not getting cancelled properly #13483

Comments

titusfortner commented Jan 22, 2024 • edited Loading

What happened?

p0deje commented Jan 22, 2024

titusfortner commented Jan 22, 2024

p0deje commented Jan 22, 2024 • edited Loading

titusfortner commented Jan 23, 2024

diemol commented Jan 23, 2024

p0deje commented Jan 23, 2024

titusfortner commented Jan 23, 2024

titusfortner commented Jan 23, 2024

titusfortner commented May 31, 2024

github-actions bot commented Jun 30, 2024

titusfortner commented Jan 22, 2024 •

edited

Loading

p0deje commented Jan 22, 2024 •

edited

Loading