
Limited parallelism with many modules #46

Open
danielsoutar opened this issue Feb 7, 2022 · 1 comment
danielsoutar commented Feb 7, 2022

As a user, I might want to split my tests up into multiple files, corresponding to separate modules. I would do this for separation of concerns, better organization, and to reduce pollution of the global namespace (e.g. with shared data or set-up/tear-down code associated with specific testsets). However, ReTest's current model of parallelism discourages this practice.

Consider the following example:

```julia
# runtests.jl

using Distributed

N_WORKERS = 2

addprocs(N_WORKERS, exeflags="--project=$(Base.active_project())")

# Propagate importing ReTest across all workers
@everywhere using ReTest

# Assume that each module_x.jl defines a `ModuleX` in the global namespace.
@everywhere include("path/to/module_1.jl")
@everywhere include("path/to/module_2.jl")
...
@everywhere include("path/to/module_M.jl")

# Call ReTest from the main process, and dispatch the workers across the testsets.
retest(Module1, Module2, ..., ModuleM; ...)
```

If M=100 and each ModuleX defined a single testset, the user would reasonably expect ReTest to approximately halve the total time with 2 workers. This is not the case: ReTest only dispatches the workers across the testsets of a single module at a time. With one testset per module, enabling parallelism therefore has no effect at all, and the added overhead actually makes it slower. On top of that, shuffling also has no effect, since it too operates per module, so any hidden dependencies between modules can never be detected. This code block in src/ReTest.jl confirms this:

```julia
for imod in eachindex(modules)
    mod, pat = modules[imod]
    tests = alltests[imod]
    ...
    shuffle &&
        shuffle!(tests)
    ...
    # TODO: move printer task out of worker?
    worker = @task begin
    ...
```
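To make the cost concrete, here is a toy scheduling model (my own sketch, not ReTest code; all names are hypothetical) comparing the wall-clock time of per-module dispatch against dispatch over a single global pool of testsets:

```julia
# Toy model: each entry of `modules` is a vector of per-testset durations (seconds).

# Greedy longest-processing-time makespan for one batch of testsets on `nworkers` workers.
function batch_makespan(durations, nworkers)
    loads = zeros(nworkers)
    for d in sort(durations; rev=true)
        i = argmin(loads)      # give the next-longest testset to the least-loaded worker
        loads[i] += d
    end
    return maximum(loads)
end

# Current model: workers are dispatched within one module at a time, modules run in sequence.
per_module_time(modules, nworkers) =
    sum(batch_makespan(ts, nworkers) for ts in modules)

# Proposed model: all testsets are pooled into one global set first.
global_time(modules, nworkers) =
    batch_makespan(reduce(vcat, modules), nworkers)

# 100 modules, one 1-second testset each, 2 workers:
modules = [[1.0] for _ in 1:100]
per_module_time(modules, 2)  # 100.0 — no speedup at all
global_time(modules, 2)      # 50.0  — time halved, as the user would expect
```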

While this is fine and can be worked around (e.g. by consolidating tests into a single file), it seems like an unnecessary restriction for the library to impose. Instead, it would be better if the testsets from all modules were gathered into a single global set that is then divided across the workers. In this model, the previewer task would behave much as it does today, walking through the testsets in the order in which they appear to retest. But instead of the workers following it in lockstep, they would return results to the previewer process as and when they complete their testsets. When the previewer reaches a testset whose result is already there, it evaluates and displays it immediately; otherwise, it waits until one of the workers completes it.
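The proposed model can be sketched in a few lines. This is a minimal single-process simulation (function and variable names are hypothetical, not ReTest's API): workers pull testset indices from one global channel, and a previewer collects results strictly in declaration order, blocking whenever the next result is not ready yet:

```julia
# Sketch of the proposed global-pool model, simulated with tasks and channels.
function run_all(testsets, nworkers)
    jobs    = Channel{Int}(length(testsets))          # global pool of testset indices
    results = [Channel{String}(1) for _ in testsets]  # one result slot per testset
    foreach(i -> put!(jobs, i), eachindex(testsets))
    close(jobs)                                       # workers stop when the pool drains

    # Workers: repeatedly take the next available testset, whichever module it came from.
    workers = map(1:nworkers) do _
        Threads.@spawn for i in jobs
            # "run" the testset (here we just produce a result string)
            put!(results[i], "ran $(testsets[i])")
        end
    end

    # Previewer: walk testsets in declaration order; take! blocks until the
    # worker that picked up testset i has finished it.
    output = String[]
    for i in eachindex(testsets)
        push!(output, take!(results[i]))
    end
    foreach(wait, workers)
    return output
end

run_all(["Module$(j).ts1" for j in 1:6], 2)
# results are displayed in declaration order regardless of which worker ran each
```

The key design point is that ordering lives entirely in the previewer, so the workers are free to load-balance over the whole global set.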

This has several advantages:

  • It improves parallelism in the case of many test modules and files
  • Parallel performance and good design through separation of concerns are no longer mutually exclusive goals
  • Shuffling can occur across modules as opposed to only per-module, greatly increasing the number of possible combinations
  • More opportunities for optimizing load-balancing are available, assuming the number of modules > 1
  • Developers that use a pattern like a single include("path/to/all_my_tests.jl") can continue to do so
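The shuffling point is easy to quantify with toy numbers (my own illustration, not from ReTest): per-module shuffling keeps the module sequence fixed and only permutes within each module, so the number of reachable orderings is the product of the per-module factorials, versus the factorial of the total testset count for a global shuffle:

```julia
# Orderings reachable by per-module shuffling (module sequence itself stays fixed).
per_module_orderings(sizes) = prod(factorial, sizes)

# Orderings reachable by shuffling across all modules at once.
global_orderings(sizes) = factorial(sum(sizes))

sizes = [3, 3, 3]                # 3 modules with 3 testsets each
per_module_orderings(sizes)      # 6^3 = 216
global_orderings(sizes)          # 9!  = 362880
```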

I'd be happy to contribute towards this, as it seems like a significant improvement in flexibility for the developer and would go a long way towards making this a package that can be used in production.

rfourquet (Collaborator) commented:

Thanks for opening this issue, these are all very good points! I haven't hit these limitations myself yet, so your feedback from a real use case is very useful. I will try to think about it in the upcoming weeks and discuss a possible path forward with you here.
