WX-927 GCP Batch: LogsPolicy is now configurable #7491

Merged
AlexITC merged 2 commits into develop from gcp-batch-logs-policy-config
Aug 29, 2024

Conversation

AlexITC
Collaborator

@AlexITC AlexITC commented Aug 10, 2024

Description

LogsPolicy is now configurable

  • When the "logs-policy" config entry is missing, the policy is set to "CLOUD_LOGGING", which is the Batch default.
  • When "logs-policy" is set to "PATH", a "task.log" file is written to the file system alongside the task files and is later pushed to Google Cloud Storage.

This is an example of where "task.log" can be found:

  • gs://project-id/cromwell-execution-root/workflow-name/workflow-id/call-myTask/task.log (workflow-id would be a UUID).
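
The behaviour described above can be sketched roughly as follows. This is an illustrative Python model, not the Scala code in this PR: the helper names `resolve_logs_policy` and `task_log_uri` are hypothetical, rejecting unknown values is an assumption, and the path helper only covers the layout shown in the example (no shard or retry directories).

```python
from typing import Mapping

# Default taken from the PR description: a missing entry means CLOUD_LOGGING.
DEFAULT_LOGS_POLICY = "CLOUD_LOGGING"
# Only the two values discussed in this PR; rejecting others is an assumption.
KNOWN_LOGS_POLICIES = {"CLOUD_LOGGING", "PATH"}


def resolve_logs_policy(batch_config: Mapping[str, str]) -> str:
    """Return the configured policy, falling back to the Batch default."""
    policy = batch_config.get("logs-policy", DEFAULT_LOGS_POLICY)
    if policy not in KNOWN_LOGS_POLICIES:
        raise ValueError(f"Unsupported logs-policy: {policy!r}")
    return policy


def task_log_uri(execution_root: str, workflow_name: str,
                 workflow_id: str, call_name: str) -> str:
    """Build the task.log location for the layout shown in the example."""
    root = execution_root.rstrip("/")
    return f"{root}/{workflow_name}/{workflow_id}/call-{call_name}/task.log"
```

For instance, `task_log_uri("gs://project-id/cromwell-execution-root", "workflow-name", "workflow-id", "myTask")` reproduces the example path above.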

Release Notes Confirmation

CHANGELOG.md

  • I updated CHANGELOG.md in this PR
  • I assert that this change shouldn't be included in CHANGELOG.md because it doesn't impact community users

Terra Release Notes

  • I added a suggested release notes entry in this Jira ticket
  • I assert that this change doesn't need Jira release notes because it doesn't impact Terra users

@AlexITC AlexITC requested a review from a team as a code owner August 10, 2024 14:14
@mcovarr
Contributor

mcovarr commented Aug 13, 2024

set it to "PATH" to save the logs to the mounted disk; at the end, this log file gets copied to the Google Cloud Storage bucket with "task.log" as the name.

Do the task logs really only get copied to the storage bucket after the job completes? Cloud Life Sciences asynchronously copies task logs to GCS periodically while the job is running. This is the main utility of the task logs in that they allow users to see how jobs are progressing before they terminate.

@AlexITC
Collaborator Author

AlexITC commented Aug 13, 2024

Do the task logs really only get copied to the storage bucket after the job completes?

Unfortunately, yes.

In theory, Google can push these logs directly to Cloud Storage, but we were told that the docs are wrong and that the feature does not work (see https://cloud.google.com/php/docs/reference/cloud-batch/latest/V1.LogsPolicy).

logs_path: The path to which logs are saved when the destination = PATH. This can be a local file path on the VM, or under the mount point of a Persistent Disk or Filestore, or a Cloud Storage path.

@dspeck1
Collaborator

dspeck1 commented Aug 13, 2024

Logs are seen as job progresses with the CLOUD_LOGGING option.

@mcovarr
Contributor

mcovarr commented Aug 13, 2024

Logs are seen as job progresses with the CLOUD_LOGGING option.

Thank you @dspeck1, I was able to see cloud task logs for my "Hello world!" workflow in realtime when filtering for batch_task_logs.

In production many Cromwell users scatter thousands of jobs simultaneously per project. Would it be possible to have these cloud logs labeled with workflow id / root workflow id / task / shard / attempt so users can search for logs specific to a Cromwell job? I see there are some GCP Batch job ID-based labels but I don't know how to associate these to jobs in Cromwell / WDL terms.

@dspeck1
Collaborator

dspeck1 commented Aug 13, 2024

We add the workflow ID as a label. Are task and shard generated by Cromwell and would they be available from the parameters or somewhere else?

@mcovarr
Contributor

mcovarr commented Aug 13, 2024

We add the workflow ID as a label. Are task and shard generated by Cromwell and would they be available from the parameters or somewhere else?

Interesting, I'm running a local build from current develop and I seem to have the code you've linked above, but I don't see either the "cromwell-workflow-id" or "goog-batch-worker" labels on my task logs 🤔.

In .labels I have hostname, job_uid, task_group_name, and task_id keys.
In .resource.labels I have job_id, location, and resource_container keys.

For the proposed additional labels, with respect to GcpBatchRequestFactoryImpl#createAllocationPolicy:

  • Root workflow id is in data.createParameters.jobDescriptor.workflowDescriptor.rootWorkflowId.
  • Everything else is in the BackendJobDescriptorKey via data.createParameters.jobDescriptor.key:
    • task name in call.identifier.localName (I think)
    • shard in index
    • attempt in attempt
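
To make the proposal concrete, here is a rough Python sketch of how those four values could be turned into labels. It is not Cromwell code: the label keys (`cromwell-root-workflow-id`, `cromwell-task`, `cromwell-shard`, `cromwell-attempt`) and both function names are hypothetical, and the sanitizing step assumes the usual Google Cloud label rules (lowercase letters, digits, underscores, and dashes, at most 63 characters).

```python
import re
from typing import Dict, Optional


def to_label_value(raw: str) -> str:
    """Lowercase the value, replace disallowed characters, cap at 63 chars."""
    return re.sub(r"[^a-z0-9_-]", "-", raw.lower())[:63]


def cromwell_job_labels(root_workflow_id: str, task_name: str,
                        shard: Optional[int], attempt: int) -> Dict[str, str]:
    """Build hypothetical labels identifying a Cromwell job in WDL terms."""
    labels = {
        "cromwell-root-workflow-id": to_label_value(root_workflow_id),
        "cromwell-task": to_label_value(task_name),
        "cromwell-attempt": str(attempt),
    }
    # Unscattered calls have no shard index, so the label is simply omitted.
    if shard is not None:
        labels["cromwell-shard"] = str(shard)
    return labels
```

Whether labels like these would actually show up on the task logs depends on Batch propagating them to Logging, which is the open question in this thread.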

@dspeck1
Collaborator

dspeck1 commented Aug 13, 2024

Oh those labels are not propagated from Batch to Logging. Less than ideal, but you can filter with gcloud. Example below.

gcloud batch jobs list --location us-west2 --filter="allocationPolicy.labels.cromwell-workflow-id=cromwell-9a2c2821-0856-49d3-842c-2ffccc2ca8ac"
NAME: projects/batch-testing-350715/locations/us-west2/jobs/job-480c07d3-0a83-48de-b40e-51fbca760d0b
LOCATION: us-west2
STATE: SUCCEEDED

Then do a describe on the job. I will ask Google if there is something to propagate additional labels to logs.
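
For anyone scripting this workaround, a minimal Python sketch of the two lookups might look like the following. The function names are hypothetical; the first builds the same `gcloud batch jobs list` invocation as above, and the second builds a Cloud Logging filter on the `job_uid` label and the `batch_task_logs` log mentioned earlier in this thread. It assumes the job UID is read from the describe output.

```python
from typing import List


def jobs_list_command(location: str, workflow_id: str) -> List[str]:
    """Argument list for finding Batch jobs that belong to one workflow."""
    label_filter = (
        "allocationPolicy.labels.cromwell-workflow-id="
        f"cromwell-{workflow_id}"
    )
    return [
        "gcloud", "batch", "jobs", "list",
        "--location", location,
        f"--filter={label_filter}",
    ]


def task_logs_filter(project_id: str, job_uid: str) -> str:
    """Cloud Logging filter for the task logs of a single Batch job."""
    return (
        f'logName="projects/{project_id}/logs/batch_task_logs" '
        f'AND labels.job_uid="{job_uid}"'
    )
```

The list returned by `jobs_list_command` can be passed to `subprocess.run`, and the string from `task_logs_filter` can be pasted into the Logs Explorer or passed to `gcloud logging read`.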

@mcovarr
Contributor

mcovarr commented Aug 13, 2024

I will ask Google if there is something to propagate additional labels to logs.

Thank you! This would be a big help in making the GCP Batch backend user-friendly at scale.

@aednichols aednichols changed the title GCP Batch: LogsPolicy is now configurable WX-927 GCP Batch: LogsPolicy is now configurable Aug 16, 2024
@AlexITC AlexITC merged commit 7ac101f into develop Aug 29, 2024
37 checks passed
@AlexITC AlexITC deleted the gcp-batch-logs-policy-config branch August 29, 2024 07:29