CLI command for default config #164

Closed
StFroese opened this issue Sep 27, 2023 · 7 comments

@StFroese

Description

Would be nice to have a CLI command which creates a default config (or a basic structure) in the current directory, e.g. law quickstart

@riga
Owner

riga commented Sep 28, 2023

Thanks for opening the request @StFroese !

Any suggestions on what the default config should include? I was thinking about

  • Arguments
law quickstart
    --directory / -r DIR  ->  directory where template is created
    --no-tasks            ->  do not write dummy tasks
    --no-config           ->  do not write a dummy config
    --no-setup            ->  do not write a setup.sh file
  • Directory structure
DIR/
├─ law.cfg
├─ setup.sh
└─ tasks/
   ├─ __init__.py
   └─ tasks.py
  • law.cfg
; law configuration example
; for more info, see https://law.readthedocs.io/en/latest/config.html

[modules]
; the task modules that should be scanned by "law index"


[logging]
; log levels mapped to python modules
law: INFO
luigi-interface: INFO
gfal2: WARNING


[luigi_core]
; luigi core settings
local_scheduler: True
scheduler_host: 127.0.0.1
scheduler_port: 8080
parallel_scheduling: False
no_lock: True
log_level: INFO


[luigi_scheduler]
; luigi scheduler settings
record_task_history: False
remove_delay: 86400
retry_delay: 30
worker_disconnect_delay: 30


[luigi_worker]
; luigi worker settings
ping_interval: 20
wait_interval: 20
check_unfulfilled_deps: False
cache_task_completion: True
keep_alive: True
force_multiprocessing: False
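
The layout and the `--no-*` switches proposed above could be sketched in Python roughly as follows. This is only an illustration of the proposal, not law's actual implementation: the function name `quickstart`, its keyword arguments, and the placeholder file contents are all assumptions made for this example.

```python
from pathlib import Path

# Hypothetical, abbreviated template contents (placeholders, not law's real templates).
CONFIG_TEMPLATE = "; law configuration example\n\n[modules]\n\n[logging]\nlaw: INFO\n"
SETUP_TEMPLATE = "#!/usr/bin/env bash\n# environment setup goes here\n"
TASKS_TEMPLATE = "# dummy tasks go here\n"


def quickstart(directory=".", tasks=True, config=True, setup=True):
    """Write the proposed template into `directory` and return the created paths."""
    root = Path(directory)
    root.mkdir(parents=True, exist_ok=True)
    created = []

    def write(path, content):
        # create parent directories on demand, then write the file
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(content)
        created.append(path)

    if config:  # skipped when --no-config is given
        write(root / "law.cfg", CONFIG_TEMPLATE)
    if setup:  # skipped when --no-setup is given
        write(root / "setup.sh", SETUP_TEMPLATE)
    if tasks:  # skipped when --no-tasks is given
        write(root / "tasks" / "__init__.py", "")
        write(root / "tasks" / "tasks.py", TASKS_TEMPLATE)
    return created
```

Calling `quickstart("my_analysis", setup=False)` would then correspond to `law quickstart --directory my_analysis --no-setup` in the proposal.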

@StFroese
Author

Hi @riga, lgtm but maybe include the dummy task module in the config by default.
Another question I have: I haven't used law yet and I'm starting right now. Why are the tasks in a folder called tasks?
In luigi the combination of Tasks and Targets is called a Workflow, but here workflows are something else, right?
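
For the suggestion to list the dummy task module in the config by default, the `[modules]` section might look like the string below. This is a sketch under two assumptions: that module names are written as bare, value-less entries under `[modules]`, and that the dummy module from the proposed layout is importable as `tasks.tasks`. Python's standard `configparser` is used here only to show that such a file parses; it is not meant to represent law's own config loader.

```python
from configparser import ConfigParser

# Assumption: modules are bare, value-less keys; "tasks.tasks" is the dummy module.
CONFIG_TEXT = """\
[modules]
; the task modules that should be scanned by "law index"
tasks.tasks

[logging]
; log levels mapped to python modules
law: INFO
"""

# allow_no_value lets "tasks.tasks" appear without a ":" or "=" delimiter
parser = ConfigParser(allow_no_value=True)
parser.read_string(CONFIG_TEXT)

modules = list(parser["modules"])
print(modules)                   # ['tasks.tasks']
print(parser["logging"]["law"])  # INFO
```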

@StFroese
Author

So what I mean is basically that it would make sense to me to have a file with different tasks and targets which depend on each other and build a workflow placed in a file called my_workflow.py inside a folder called workflows

@StFroese
Author

StFroese commented Oct 2, 2023

I'd rather have the Workflow classes called TaskTree :)

@riga
Owner

riga commented Oct 2, 2023

> I'd rather have the Workflow classes called TaskTree :)

I see the appeal of that, but I fear at this point, a renaming would disrupt many existing setups. Also, the term "workflow" in luigi is defined rather loosely, so the overlap is perhaps smaller than you might think :)

> So what I mean is basically that it would make sense to me to have a file with different tasks and targets which depend on each other and build a workflow placed in a file called my_workflow.py inside a folder called workflows.

Maybe the confusion resides here. The directory where tasks are located does not need to be called "tasks", nor do tasks have to be defined in just one directory or in just one repository. In the instance above, it was just an example, but we should perhaps call it "my_package" to convey that this directory should most likely be used like a normal python package that happens to define tasks.

In this sense, with neither bare luigi nor law does one ever start a "workflow" (in your definition of "workflow" above), but just a single task that, through its recursive dependencies, creates a task tree (yep, luigi also calls it "task tree" sometimes 👀) represented by a DAG which is then processed by luigi. What this tree looks like is fully determined by how your tasks are defined, by runtime parameters or environment variables, and - of course - by what needs to be processed, i.e., which tasks were already complete in the first place. Therefore, at least to me, it doesn't make too much sense to call tasks in a directory a "workflow". For instance, depending on a parameter, the task tree / DAG might look completely different. But just because the tasks for that are defined at the same place, one still wouldn't consider both trees to be the same "workflow".
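
To make the point about parameters and completeness concrete, here is a plain-Python sketch that mimics the `requires()` / `complete()` pattern without importing luigi. The class names (`Download`, `Calibrate`, `Plot`), the `calibrated` parameter, and the `pending` helper are all made up for illustration; real luigi tasks and its scheduler are considerably more involved.

```python
class Task:
    """Minimal stand-in for a luigi-style task (not luigi's actual class)."""

    done = False  # hypothetical flag replacing a real output-existence check

    def requires(self):
        return []

    def complete(self):
        return self.done


class Download(Task):
    pass


class Calibrate(Task):
    def requires(self):
        return [Download()]


class Plot(Task):
    def __init__(self, calibrated=False):
        self.calibrated = calibrated

    def requires(self):
        # the parameter decides which branch of the tree is built
        return [Calibrate()] if self.calibrated else [Download()]


def pending(task):
    """Collect class names of tasks that still have to run, dependencies first."""
    if task.complete():
        return []  # a complete task prunes its whole subtree
    names = []
    for dep in task.requires():
        names.extend(pending(dep))
    names.append(type(task).__name__)
    return names


print(pending(Plot()))                  # ['Download', 'Plot']
print(pending(Plot(calibrated=True)))   # ['Download', 'Calibrate', 'Plot']

Calibrate.done = True                   # pretend the calibration output already exists
print(pending(Plot(calibrated=True)))   # ['Plot']
```

The same three classes, defined in the same file, yield three different trees depending on the parameter and on what is already complete, which is why the directory holding them is not itself a "workflow".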

@pfackeldey
Contributor

Regarding the law quickstart option, it may be more appropriate to have a cookie/copier-template (see: https://github.com/copier-org/copier). This would allow for a wider variety of setups (e.g. similar to the different law-examples), in case that is needed at some point...

Apart from that, I like the simplicity of law quickstart :)

@riga closed this as completed in ce82c5f on Oct 2, 2023
@StFroese
Author

StFroese commented Oct 2, 2023

@riga thanks for the explanation, I think it's alright for me then. I guess no one stops me from renaming my folder anyway haha :)

3 participants