Commit

Merge pull request #150 from riga/feature/crab_workflows
CMS crab workflows
riga committed Aug 23, 2023
2 parents 04fedbf + 02d504c commit c7f39b3
Showing 30 changed files with 2,573 additions and 230 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/lint_and_test.yml
@@ -48,7 +48,7 @@ jobs:
strategy:
fail-fast: false
matrix:
- python: ["27", "36", "37", "38", "39", "310"]
+ python: ["27", "36", "37", "38", "39", "310", "311"]
name: test (python ${{ matrix.python }})
steps:
- name: Checkout 🛎️
5 changes: 3 additions & 2 deletions README.md
@@ -17,7 +17,7 @@


**Note**: This project is currently under development.
- Version 1.0.0 will be the first, fully documented beta release, targeted for mid 2023.
+ Version 1.0.0 will be the first, fully documented beta release, targeted for fall 2023.

Use law to build complex and large-scale task workflows.
It is built on top of [luigi](https://github.com/spotify/luigi) and adds abstractions for **run locations**, **storage locations** and **software environments**.
@@ -29,7 +29,7 @@ Key features:
- Remote targets with automatic retries and local caching
- WebDAV, HTTP, Dropbox, SFTP, all WLCG protocols (srm, xrootd, dcap, gsiftp, webdav, ...)
- Automatic submission to batch systems from within tasks
- - HTCondor, LSF, gLite, ARC, Slurm
+ - HTCondor, LSF, gLite, ARC, Slurm, CMS-CRAB
- Environment sandboxing, configurable on task level
- Docker, Singularity, Sub-Shells, Virtual envs

@@ -141,6 +141,7 @@ docker run -ti riga/law:example <example_name>
- [wlcg_targets](https://github.com/riga/law/tree/master/examples/wlcg_targets): Working with targets that are stored on WLCG storage elements (dCache, EOS, ...). TODO.
- [htcondor_at_vispa](https://github.com/riga/law/tree/master/examples/htcondor_at_vispa): HTCondor workflows at the [VISPA service](https://vispa.physik.rwth-aachen.de).
- [htcondor_at_cern](https://github.com/riga/law/tree/master/examples/htcondor_at_cern): HTCondor workflows at the CERN batch infrastructure.
- [CMS Crab at CERN](https://github.com/riga/law_example_CMSCrabWorkflows): CMS Crab workflows executed from lxplus at CERN.
- [sequential_htcondor_at_cern](https://github.com/riga/law/tree/master/examples/sequential_htcondor_at_cern): Continuation of the [htcondor_at_cern](https://github.com/riga/law/tree/master/examples/htcondor_at_cern) example, showing sequential jobs that eagerly start once jobs running previous requirements succeeded.
- [htcondor_at_naf](https://github.com/riga/law/tree/master/examples/htcondor_at_naf): HTCondor workflows at German [National Analysis Facility (NAF)](https://confluence.desy.de/display/IS/NAF+-+National+Analysis+Facility).
- [slurm_at_maxwell](https://github.com/riga/law/tree/master/examples/slurm_at_maxwell): Slurm workflows at the [Desy Maxwell cluster](https://confluence.desy.de/display/MXW/Maxwell+Cluster).
4 changes: 2 additions & 2 deletions examples/htcondor_at_cern/analysis/bootstrap.sh
@@ -7,6 +7,6 @@
# base tasks in analysis/framework.py.

action() {
- source "{{analysis_path}}/setup.sh"
+ source "{{analysis_path}}/setup.sh" "$@"
  }
- action
+ action "$@"
82 changes: 82 additions & 0 deletions law.cfg.example
@@ -33,6 +33,8 @@
; - [singularity_sandbox]
; - [singularity_sandbox_env]
; - [singularity_sandbox_volumes]
; - [cmssw_sandbox]
; - [cmssw_sandbox_env]
;
; Please note that configuration options of law contrib packages might also appear in sections of
; the general law configuration.
@@ -905,6 +907,36 @@
; Type: integer
; Default: 25

; crab_job_file_dir
; crab_job_file_dir_mkdtemp
; crab_job_file_dir_cleanup
; Description: These three options are identical to the ones above without the "crab" prefix, but
; only apply to the "law.cms.CrabJobFileFactory". When "None" or not existing, the values above are
; used.
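
The fallback described for the "crab"-prefixed options can be pictured with a minimal sketch. It assumes the job section has already been read into a plain dictionary; the helper name `resolve_job_option` and the example values are hypothetical and not part of law's API.

```python
def resolve_job_option(job_section, name, prefix="crab"):
    """Return the prefixed option when it is set, else the general one.

    Hypothetical helper for illustration only, not law's implementation.
    """
    value = job_section.get("{}_{}".format(prefix, name))
    if value is None:
        # prefixed option is missing or None: fall back to the unprefixed option
        value = job_section.get(name)
    return value


# example values are assumptions chosen for illustration
job_section = {
    "job_file_dir": "/tmp/law_jobs",
    "job_file_dir_cleanup": False,
    "crab_job_file_dir": None,
    "crab_job_file_dir_cleanup": True,
}

print(resolve_job_option(job_section, "job_file_dir"))
print(resolve_job_option(job_section, "job_file_dir_cleanup"))
```

Here the directory falls back to the general setting because the prefixed value is None, while the cleanup flag is taken from the prefixed option.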

; crab_work_area
; Description: The directory in which the "law.cms.CrabJobManager" will create crab project
; directories upon submission. Defaults to "crab_job_file_dir".
; Type: string
; Default: job.crab_job_file_dir

; crab_sandbox_name
; Description: The name of the "cmssw" sandbox to use which provides the crab executable. In its
; simplest form, this is just the version string of the CMSSW release to use. However, additional
; settings can be appended with the "::" delimiter in the format
; "cmssw_version::setting=value::setting=value" for more configurability. Supported settings are
; "setup", an additional setup script that is executed inside the src directory during installation,
; "dir", a custom install directory, "arch", a custom scram architecture, and "cores", the number of
; CPU cores to use for installation.
; Type: string
; Default: CMSSW_10_6_30
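
To make the "::"-delimited format concrete, the following sketch splits such a string into the version and a dictionary of settings. The function `parse_sandbox_name` is a hypothetical illustration and may differ from how law parses the value internally; the architecture string in the usage line is an assumed example value.

```python
# settings named in the option description above
SUPPORTED_SETTINGS = {"setup", "dir", "arch", "cores"}


def parse_sandbox_name(name):
    """Split "version::key=value::..." into (version, settings).

    Hypothetical helper for illustration only, not law's implementation.
    """
    version, *parts = name.split("::")
    settings = {}
    for part in parts:
        key, sep, value = part.partition("=")
        if not sep or key not in SUPPORTED_SETTINGS:
            raise ValueError("unsupported setting: {!r}".format(part))
        # values are kept as strings; any type conversion is left to the caller
        settings[key] = value
    return version, settings


# the arch value below is an assumption used only as an example
print(parse_sandbox_name("CMSSW_10_6_30::arch=slc7_amd64_gcc700::cores=4"))
```

A plain version string such as "CMSSW_10_6_30" yields an empty settings dictionary.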

; crab_password_file
; Description: A file containing the X509 certificate password for automatic proxy delegations by
; the "law.cms.CrabJobManager".
; Type: string
; Default: None


; --- notifications section ------------------------------------------------------------------------

@@ -1238,3 +1270,53 @@
; represent host and container directory, respectively.
; The volumes defined in this section are applied to all singularity sandboxes. To configure volumes
; per image, create a section "singularity_sandbox_volumes_<image_name>" with the desired values.


; --- cmssw_sandbox section ------------------------------------------------------------------------

[cmssw_sandbox]

; Note:
; This section defines the default options for all cmssw sandboxes. To configure options per cmssw
; version, create a section "cmssw_sandbox_<cmssw_version>" with the desired options.


; stagein_dir_name
; Description: Name of a directory which is placed automatically inside a temporary directory and to
; which input files are staged-in outside of the sandbox and provided to the sandboxed task. When
; "None", no stage-in is performed and the task is assumed to be able to access inputs directly from
; within the sandbox.
; Type: string
; Default: "stagein"

; stageout_dir_name
; Description: Name of a directory which is placed automatically inside a temporary directory and
; provided to sandboxed tasks to store outputs and from which those files are staged-out outside the
; sandbox. When "None", no stage-out is performed and the task is assumed to be able to store
; outputs directly from within the sandbox.
; Type: string
; Default: "stageout"

; law_executable
; Description: The law executable to use within sandboxes, e.g. "law" or "python -m law".
; Type: string
; Default: "law"

; login
; Description: A boolean flag that decides whether the bash beneath the sandbox should be invoked as
; a login shell.
; Type: boolean
; Default: False
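
The per-version override mentioned in the note of this section could work roughly as in the sketch below, which uses Python's standard `configparser` rather than law's own config class. The helper `sandbox_option` and the example section contents are assumptions for illustration.

```python
import configparser

# example configuration, assumed for illustration
CFG = """
[cmssw_sandbox]
law_executable = law
login = False

[cmssw_sandbox_CMSSW_10_6_30]
login = True
"""


def sandbox_option(parser, version, option):
    """Prefer the version-specific section, else the general one.

    Hypothetical helper for illustration only, not law's implementation.
    """
    specific = "cmssw_sandbox_{}".format(version)
    # has_option returns False when the section itself does not exist
    if parser.has_option(specific, option):
        return parser.get(specific, option)
    return parser.get("cmssw_sandbox", option)


parser = configparser.ConfigParser()
parser.read_string(CFG)

print(sandbox_option(parser, "CMSSW_10_6_30", "login"))
print(sandbox_option(parser, "CMSSW_10_6_30", "law_executable"))
```

Options that are not overridden for a version are read from the general "cmssw_sandbox" section.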


; --- cmssw_sandbox_env section --------------------------------------------------------------------

[cmssw_sandbox_env]

; Note:
; Here you can define environment variables via key-value pairs that are accessible in a cmssw
; sandbox. When an option has no value, i.e., when only a key is given, the value of the variable of
; the current environment is used.
; The environment variables defined in this section are applied to all cmssw sandboxes. To
; configure variables per cmssw version, create a section "cmssw_sandbox_env_<cmssw_version>" with
; the desired values.
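
The key-only behavior described above (a key without a value inherits the variable from the current environment) can be sketched with the standard `configparser`. The helper `sandbox_env` is hypothetical; skipping keys that are missing from the environment is an assumption, since the description does not say what happens in that case.

```python
import configparser
import os

# example configuration, assumed for illustration
CFG = """
[cmssw_sandbox_env]
MY_FLAG = 1
HOME
"""


def sandbox_env(parser, section="cmssw_sandbox_env", environ=None):
    """Build the environment dict for a sandbox from a config section.

    Hypothetical helper for illustration only, not law's implementation.
    """
    environ = os.environ if environ is None else environ
    env = {}
    for key, value in parser.items(section):
        if value is None:
            # key without a value: take it from the current environment
            if key not in environ:
                continue  # assumption: silently skip unset variables
            value = environ[key]
        env[key] = value
    return env


# allow_no_value permits bare keys; interpolation is disabled to keep values raw
parser = configparser.ConfigParser(allow_no_value=True, interpolation=None)
parser.optionxform = str  # keep variable names case-sensitive
parser.read_string(CFG)

print(sandbox_env(parser, environ={"HOME": "/home/user"}))
```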
11 changes: 11 additions & 0 deletions law/config.py
@@ -246,6 +246,10 @@ def __str__(self):
"slurm_job_file_dir_cleanup": False,
"slurm_chunk_size_cancel": 25,
"slurm_chunk_size_query": 25,
"crab_job_file_dir": None,
"crab_job_file_dir_cleanup": False,
"crab_sandbox_name": "CMSSW_10_6_30",
"crab_password_file": None,
},
"notifications": {
"mail_recipient": None,
@@ -299,6 +303,13 @@ def __str__(self):
},
"singularity_sandbox_env": {},
"singularity_sandbox_volumes": {},
"cmssw_sandbox": {
"stagein_dir_name": "stagein",
"stageout_dir_name": "stageout",
"law_executable": "law",
"login": False,
},
"cmssw_sandbox_env": {},
}

_config_files = ["$LAW_CONFIG_FILE", "law.cfg", law_home_path("config"), "etc/law/config"]
7 changes: 3 additions & 4 deletions law/contrib/arc/job.py
@@ -10,7 +10,6 @@

import os
import stat
- import sys
import time
import re
import random
@@ -81,7 +80,7 @@ def submit(self, job_file, job_list=None, ce=None, retries=0, retry_delay=3, sil
# run the command
logger.debug("submit arc job(s) with command '{}'".format(cmd))
code, out, _ = interruptable_popen(cmd, shell=True, executable="/bin/bash",
- stdout=subprocess.PIPE, stderr=sys.stderr, cwd=job_file_dir)
+ stdout=subprocess.PIPE, cwd=job_file_dir)

# in some cases, the return code is 0 but the ce did not respond valid job ids
job_ids = []
@@ -135,7 +134,7 @@ def cancel(self, job_id, job_list=None, silent=False):
# run it
logger.debug("cancel arc job(s) with command '{}'".format(cmd))
code, out, _ = interruptable_popen(cmd, shell=True, executable="/bin/bash",
- stdout=subprocess.PIPE, stderr=sys.stderr)
+ stdout=subprocess.PIPE)

# check success
if code != 0 and not silent:
@@ -163,7 +162,7 @@ def cleanup(self, job_id, job_list=None, silent=False):
# run it
logger.debug("cleanup arc job(s) with command '{}'".format(cmd))
code, out, _ = interruptable_popen(cmd, shell=True, executable="/bin/bash",
- stdout=subprocess.PIPE, stderr=sys.stderr)
+ stdout=subprocess.PIPE)

# check success
if code != 0 and not silent:
8 changes: 6 additions & 2 deletions law/contrib/arc/workflow.py
@@ -139,9 +139,13 @@ def create_job_file(self, job_num, branches):
return {"job": job_file, "log": abs_log_file}

def destination_info(self):
- info = ["ce: {}".format(",".join(self.task.arc_ce))]
+ info = super(ARCWorkflowProxy, self).destination_info()
+
+ info["ce"] = "ce: {}".format(",".join(self.task.arc_ce))
+
  info = self.task.arc_destination_info(info)
- return ", ".join(map(str, info))
+
+ return info
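
The change above moves `destination_info` from building and joining a list to extending a mapping returned by the parent class and leaving the formatting to the caller. A minimal sketch of that pattern follows; the classes `BaseProxy` and `ARCProxy`, the use of `OrderedDict` in the base class, and the `format_destination_info` helper are assumptions for illustration, since the base implementation is not shown in this diff.

```python
from collections import OrderedDict


class BaseProxy(object):
    """Stand-in for the base workflow proxy (assumed to return a mapping)."""

    def destination_info(self):
        return OrderedDict()


class ARCProxy(BaseProxy):
    """Stand-in for the ARC proxy; only adds its own entry to the mapping."""

    def __init__(self, ces):
        self.ces = ces

    def destination_info(self):
        info = super(ARCProxy, self).destination_info()
        info["ce"] = "ce: {}".format(",".join(self.ces))
        return info


def format_destination_info(info):
    # joining is deferred to the caller instead of each subclass
    return ", ".join(map(str, info.values()))


proxy = ARCProxy(["ce1", "ce2"])
print(format_destination_info(proxy.destination_info()))
```

Keyed entries let hooks such as `arc_destination_info` add, replace, or remove individual items before the final string is built.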


class ARCWorkflow(BaseRemoteWorkflow):
14 changes: 11 additions & 3 deletions law/contrib/cms/__init__.py
@@ -4,10 +4,18 @@
CMS-related contrib package. https://home.cern/about/experiments/cms
"""

- __all__ = ["CMSJobDashboard", "BundleCMSSW", "Site", "lfn_to_pfn"]
+ __all__ = [
+     "CMSSWSandbox",
+     "CrabJobManager", "CrabJobFileFactory", "CMSJobDashboard",
+     "CrabWorkflow",
+     "BundleCMSSW",
+     "Site", "lfn_to_pfn", "renew_vomsproxy", "delegate_myproxy",
+ ]


# provisioning imports
- from law.contrib.cms.job import CMSJobDashboard
+ from law.contrib.cms.sandbox import CMSSWSandbox
+ from law.contrib.cms.job import CrabJobManager, CrabJobFileFactory, CMSJobDashboard
+ from law.contrib.cms.workflow import CrabWorkflow
  from law.contrib.cms.tasks import BundleCMSSW
- from law.contrib.cms.util import Site, lfn_to_pfn
+ from law.contrib.cms.util import Site, lfn_to_pfn, renew_vomsproxy, delegate_myproxy
31 changes: 31 additions & 0 deletions law/contrib/cms/crab/PSet.py
@@ -0,0 +1,31 @@
# coding: utf-8

"""
Minimal valid configuration.
"""

import FWCore.ParameterSet.Config as cms


process = cms.Process("LAW")

process.source = cms.Source(
    "PoolSource",
    fileNames=cms.untracked.vstring([""]),
)

process.output = cms.OutputModule(
    "PoolOutputModule",
    fileName=cms.untracked.string("out.root"),
)

process.maxEvents = cms.untracked.PSet(
    input=cms.untracked.int32(1),
)

process.options = cms.untracked.PSet(
    allowUnscheduled=cms.untracked.bool(True),
    wantSummary=cms.untracked.bool(False),
)

process.out = cms.EndPath(process.output)