New egs-parallel set of scripts to replace run_user_code_batch #628

Merged 21 commits on Apr 12, 2021

Commits on Mar 26, 2021

  1. 844c441
  2. Improve egs-parallel scripts

    Save the egs-parallel log inside an *.egsparallel file in the
    application directory, and add a verbosity option (-v) to also echo the
    log to screen. By default the scripts proceed silently, unless an error
    condition arises, which is always echoed to the terminal.
    ftessier committed Mar 26, 2021
    c694918
  3. 17aae81
  4. Improve top-level egs-parallel script

    Notably, save log messages to a log file, add a verbosity option (-v),
    and allow joined single-letter options and arguments (without a space
    between the option and its argument, as in "-n123").
    ftessier committed Mar 26, 2021
    08d17f2
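
The joined option form described in this commit ("-n123") can be handled with a plain case statement. The following is a minimal sketch, not the actual egs-parallel code: the function name parse_args and the variables nthread and verbose are hypothetical, and only the -n and -v options mentioned in these commits are handled.

```shell
#!/bin/sh
# Hypothetical sketch: accept "-n 123" and "-n123" alike.
# parse_args, nthread and verbose are illustrative names only.
parse_args() {
    nthread=1
    verbose=no
    while [ $# -gt 0 ]; do
        case "$1" in
            -v)
                verbose=yes ;;
            -n)
                # separate form: the argument is the next word
                [ $# -ge 2 ] || { echo "missing argument for -n" >&2; return 1; }
                nthread=$2
                shift ;;
            -n?*)
                # joined form: strip the leading "-n" to get the argument
                nthread=${1#-n} ;;
            *)
                echo "unknown option: $1" >&2
                return 1 ;;
        esac
        shift
    done
}

parse_args -v -n123 && echo "nthread=$nthread verbose=$verbose"
```

The exact "-n" pattern is listed before "-n?*" so that the separate and joined forms are told apart by whether anything follows the option letter.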
  5. Improve egs-parallel sub-scripts

    Apart from format and other minor adjustments, update the standard pbs
    script egs-parallel-pbs (whereby EGSnrc jobs are submitted individually)
    so that only the second job waits for the .egsjob file and the .lock
    file, since the jobs are submitted sequentially anyway.
    ftessier committed Mar 26, 2021
    996919b
  6. Add an egs-parallel sub-script for multicore cpus

    This egs-parallel-cpu subscript provides the option "--batch cpu" to
    egs-parallel, to launch a simulation on multiple cores on the local cpu,
    without requiring a job scheduler. Intentionally, this script is simple:
    it just launches the jobs sequentially, without waiting around for the
    .egsjob or .lock files, as in the pbs scripts. However, the logging is
    consistent with the other egs-parallel scripts.
    
    The number of threads is always constrained to the number of threads
    available on the machine, because it is inefficient to go beyond that,
    and launching a large number of threads on a cpu by mistake may well
    stall the computer.
    ftessier committed Mar 26, 2021
    24be243
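
The thread limit described in this commit amounts to taking the smaller of the requested and the available thread counts. A minimal sketch follows, assuming that getconf _NPROCESSORS_ONLN is available to report the number of online logical processors (it is on Linux and macOS); the function name clamp_nthread is hypothetical and the real egs-parallel-cpu script may query the machine differently.

```shell
#!/bin/sh
# Hypothetical sketch: cap a requested thread count at what the machine offers.
# clamp_nthread is an illustrative name, not taken from egs-parallel-cpu.
clamp_nthread() {
    requested=$1
    available=$2
    if [ "$requested" -gt "$available" ]; then
        echo "$available"
    else
        echo "$requested"
    fi
}

# Assumption: getconf _NPROCESSORS_ONLN reports the online logical processors;
# fall back to 1 if the query fails.
available=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
echo "using $(clamp_nthread 1000 "$available") of $available threads"
```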
  7. Overhaul script to tidy up after egs-parallel runs

    Improve the script robustness, in particular by forcing the user to
    specify either the -n (--dry-run) option, or the -f (--force) option to
    actually remove files, to prevent accidental erasing (to some extent).
    This script removes files without warnings (when using -f), so use with
    caution: run with the -n option first to see what will be deleted.
    
    Add the concatenation and sorting of egs-parallel log messages into the
    .egsparallel file for reference. This is useful, because these log
    messages may be scattered in different files, for example the .eo files
    from pbs. After cleaning, the .egsparallel contains a time-ordered
    sequence of messages from egs-parallel and its subscripts.
    ftessier committed Mar 26, 2021
    bc4ebc7
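
The safeguard described in this commit (nothing happens unless -n or -f is given) can be sketched as below. This is an illustration under stated assumptions, not the cleaning script itself: the function name clean_files and the sample file names are hypothetical, and the demonstration runs in a scratch directory.

```shell
#!/bin/sh
# Hypothetical sketch of a cleanup that refuses to act unless told how.
# clean_files and its calling convention are illustrative only.
clean_files() {
    mode=$1
    shift
    case "$mode" in
        -n|--dry-run)
            for f in "$@"; do echo "would remove $f"; done ;;
        -f|--force)
            for f in "$@"; do rm -f -- "$f" && echo "removed $f"; done ;;
        *)
            echo "specify -n (--dry-run) or -f (--force)" >&2
            return 1 ;;
    esac
}

# Demonstration in a scratch directory, so nothing of value is touched.
tmp=$(mktemp -d)
touch "$tmp/run_w1.egslog" "$tmp/run_w2.egslog"
clean_files -n "$tmp"/*.egslog    # lists the files, removes nothing
clean_files -f "$tmp"/*.egslog    # actually removes them
rmdir "$tmp"
```

Requiring an explicit mode means that a bare invocation is an error instead of a silent deletion.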
  8. Change ncore to nthread in egs-parallel scripts

    Strictly speaking, there can be multiple threads per hardware core;
    this is typical in modern workstations. Change "ncore" to "nthread"
    throughout the egs-parallel scripts, to avoid confusion.
    ftessier committed Mar 26, 2021
    e89a4a1
  9. Add HEN_HOUSE/scripts/bin directory, add to path

    Add a bin directory in HEN_HOUSE/scripts and add it to the PATH in the
    shell additions scripts. This allows some EGSnrc scripts to be directly
    executable by a user, without using aliases (which are not inherited by
    subshells). The immediate motivation is for the top-level egs-parallel
    script, and the egs-parallel-clean script, to become visible on the
    path, while the egs-parallel sub-scripts remain in scripts and are not
    in the path (these should not be invoked directly).
    ftessier committed Mar 26, 2021
    5ccfdda
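
Adding the bin directory to the PATH, as this commit describes, might look like the sketch below. The helper path_prepend is hypothetical (it is not part of the EGSnrc shell additions), and the default value given to HEN_HOUSE is a placeholder so that the example runs on its own. Unlike an alias, an exported PATH is inherited by subshells and child processes.

```shell
#!/bin/sh
# Hypothetical sketch: prepend a directory to PATH only once.
# path_prepend is an illustrative helper, not part of the shell additions.
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;                  # already present: leave PATH unchanged
        *) PATH="$1:$PATH" ;;
    esac
    export PATH
}

# Assumption: HEN_HOUSE points at the EGSnrc installation; a placeholder
# default is used here so that the sketch is self-contained.
HEN_HOUSE=${HEN_HOUSE:-/opt/EGSnrc/HEN_HOUSE}
path_prepend "${HEN_HOUSE%/}/scripts/bin"
echo "$PATH"
```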
  10. Remove shell additions sourcing in egs-parallel

    Do not source the shell additions scripts from within the egs-parallel
    sub-scripts, as this is not necessary and not secure. Sourcing was only
    needed in the dshtask script to get the path to the EGSnrc executables,
    because tasks are launched on the pbs nodes without inheriting the
    environment. In this case, simply export the PATH variable via the
    pbsdsh qsub script.
    ftessier committed Mar 26, 2021
    73cead2
  11. Tweak timestamp and usage in egs-parallel scripts

    Use a more portable date command format for the timestamp string, and
    tweak the usage message of egs-parallel scripts.
    ftessier committed Mar 26, 2021
    8b82463
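
A timestamp that relies only on POSIX date conversions could be produced as in the sketch below. The commit message does not show the format actually adopted, so the year-first layout used here is an assumption, chosen because such strings sort chronologically as plain text; the function names timestamp and log are hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch: a timestamp built only from POSIX date conversions.
# The format used by egs-parallel is not shown in the commit message;
# this year-first layout is an assumption.
timestamp() {
    date +"%Y-%m-%d %H:%M:%S"
}

# Prefix a message with the current timestamp.
log() {
    echo "$(timestamp) $*"
}

log "egs-parallel: begin"
```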
  12. Add -x and -v options to egs-parallel-clean

    Add -x (--extra) option to clean up egs-parallel log files .egsparallel
    and .egsparallel.eo. Although this script always echoes progress to the
    terminal, add a -v (--verbose) option to echo the commands that are run
    by the script, instead of the more concise messages usually reported.
    Internally, add an "action" command to ensure that the log messages
    remain up to date with the commands.
    ftessier committed Mar 26, 2021
    19393ee
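
One way to picture the internal "action" command mentioned in this commit is a wrapper that receives both a concise message and the command to run. The sketch below is an assumption about the general idea, not the script's actual implementation; the variable verbose and the calling convention are hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch of an "action" wrapper. In verbose mode the echoed text
# is the command line itself, so it cannot drift from what is executed.
verbose=no

action() {
    summary=$1
    shift
    if [ "$verbose" = yes ]; then
        echo "+ $*"          # verbose: echo the command that is about to run
    else
        echo "$summary"      # default: concise message
    fi
    "$@"                     # run the command exactly as given
}

tmp=$(mktemp)
action "removing scratch file" rm -f -- "$tmp"
verbose=yes
action "removing scratch file" rm -f -- "$tmp"
```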
  13. Add -l (--list) option to egs-parallel-clean

    For convenience, add a -l (--list) option to the cleaning script to list
    all the .egslog file base names in the current directory. This option is
    checked first and overrides every other argument: the list is printed to
    the terminal and the script terminates. Also, reformat the usage message
    and use the extension .egsparallel-eo (with a hyphen) to avoid collision
    with the pbs .eo extension. Use executable basename in quit function.
    ftessier committed Mar 26, 2021
    20fc34b
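
Listing the .egslog base names, as the -l option does, can be sketched with a glob and basename. The function name list_egslog is hypothetical, and the real script may format its output differently.

```shell
#!/bin/sh
# Hypothetical sketch: print the base name of every .egslog file in a directory.
# list_egslog is an illustrative name only.
list_egslog() {
    dir=${1:-.}
    for f in "$dir"/*.egslog; do
        [ -e "$f" ] || continue       # the glob matched nothing
        basename "$f" .egslog
    done
}

list_egslog .
```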
  14. c48b343
  15. Change default batch system to cpu in egs-parallel

    Change the initial value of the --batch option to "cpu" so that the
    script invokes the multicore parallel sub-script (egs-parallel-cpu) when
    no --batch option is specified on the command line. This allows users to
    try egs-parallel out of the box (most computers are multicore nowadays)
    without worrying about schedulers.
    ftessier committed Mar 26, 2021
    ad2d8f9
  16. 43137ff
  17. Remove dependencies on lock file in egs-parallel

    Don't quit the egs-parallel submit scripts if no lock file is found, and
    add a -f (--force) option to override existing .egsjob or .lock files.
    
    The lock file for parallel jobs is managed inside EGSnrc, so the script
    should not manage it as well: this creates an obscure correlation
    between the code and the script. Moreover, the uniform run control
    method does not create a lock file. Previously, the submit script would
    quit if there was no lock file. The top-level egs-parallel script now
    prevents the run if there is an .egsjob file OR a .lock file, for the
    same reason. This can be overridden with the added --force option.
    ftessier committed Mar 26, 2021
    c9ff999
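
The check described in this commit (refuse to run if an .egsjob file or a .lock file exists, unless forced) can be sketched as follows. The function name check_stale and its yes/no force argument are hypothetical. Whether the real script also deletes the stale files when forced is not stated in the commit message, so this sketch leaves them in place.

```shell
#!/bin/sh
# Hypothetical sketch: block a run when leftover control files exist.
# usage: check_stale BASENAME FORCE   (FORCE is "yes" or "no")
check_stale() {
    for ext in egsjob lock; do
        if [ -e "$1.$ext" ] && [ "$2" != yes ]; then
            echo "found $1.$ext: use -f (--force) to override" >&2
            return 1
        fi
    done
    return 0
}

# Demonstration in a scratch directory with a leftover lock file.
tmp=$(mktemp -d)
touch "$tmp/slab.lock"
check_stale "$tmp/slab" no  || echo "run refused"
check_stale "$tmp/slab" yes && echo "run allowed with --force"
rm -rf "$tmp"
```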
  18. Detect failure to launch pbs job in egs-parallel

    Detect pbs jobs that fail to launch in egs-parallel, by looking at the
    echoed job pid: quit immediately if it is not an integer. If the first
    job fails, subsequent jobs are not launched. Report the failure in the
    log. Also adjust the format of a few log messages.
    ftessier committed Mar 26, 2021
    58b6cd2
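
The launch check described in this commit reduces to testing whether the echoed job identifier is an integer, and stopping at the first failure. In the sketch below, fake_submit is a stand-in invented for illustration (it is not a scheduler command), and is_integer and launch_all are hypothetical names; how the real script obtains the identifier from the scheduler is not shown in the commit message.

```shell
#!/bin/sh
# Hypothetical sketch: stop submitting as soon as a job id is not an integer.
is_integer() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;      # empty, or contains a non-digit
        *) return 0 ;;
    esac
}

# Stand-in for the real submission command (an assumption for this sketch):
# echoes a numeric id for jobs 1 and 2, and an error text from job 3 onward.
fake_submit() {
    if [ "$1" -lt 3 ]; then
        echo "$((1000 + $1))"
    else
        echo "qsub: error"
    fi
}

launch_all() {
    job=1
    while [ "$job" -le "$1" ]; do
        pid=$(fake_submit "$job")
        if ! is_integer "$pid"; then
            echo "job $job: FAILED to launch ($pid)" >&2
            return 1
        fi
        echo "job $job: launched with id $pid"
        job=$((job + 1))
    done
}

launch_all 4 || echo "stopped: later jobs were not submitted"
```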
  19. Fix pbsdsh jobnames starting with a period

    Fix a crash that occurred when the 14 character truncation of the
    filename for an egs-parallel pbsdsh job ended up starting with a '.'.
    The first character is now trimmed away if that is the case, so the job
    name is only 13 characters.
    rtownson authored and ftessier committed Mar 26, 2021
    8df4716
  20. Strip non-alphanumeric lead chars in PBS job name

    Ensure that the PBS job name starts with an alphanumeric character
    [0-9A-Za-z], following the PBS scheduler requirement. To avoid failed
    jobs solely on account of a bad job name, strip all leading
    non-alphanumeric characters from the job name. Note that the egsinp
    basename is not affected; this is strictly for the job name passed to
    qsub via the -N option.
    ftessier committed Mar 26, 2021
    8a7311f
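
Stripping the leading non-alphanumeric characters from a job name, as this commit describes, can be done with parameter expansion alone. The function name pbs_job_name is hypothetical, and the sketch covers only the stripping step, not the length truncation mentioned in the previous commit.

```shell
#!/bin/sh
# Hypothetical sketch: drop leading characters until the name starts with
# one of [0-9A-Za-z]. pbs_job_name is an illustrative name only.
pbs_job_name() {
    name=$1
    while :; do
        case "$name" in
            [0-9A-Za-z]*|'') break ;;     # valid first character, or nothing left
            *) name=${name#?} ;;          # remove the first character
        esac
    done
    echo "$name"
}

pbs_job_name ".hidden_input"
```

Only the job name is altered here; the input file name passed in is left untouched, in keeping with the note in the commit message.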

Commits on Mar 31, 2021

  1. 01b86ac