Refactor tests #131

Merged: 2 commits merged into tskit-dev:main from refactor-tests, Jan 26, 2021

Conversation

@hyanwong (Member) commented Jan 24, 2021

This is only a draft of what can be done, so let me know if it looks right @petrelharp, and I can try rolling it out to the other tests too. Only the test_tree_sequence.py::TestSlimTime test has been properly altered to use the basic_slim_recipe fixture, but it does work entirely in parallel, both over the tests and over the recipes, e.g. if you do

pytest test_tree_sequence.py::TestSlimTime -n 4

for all the recipes, or just

pytest test_tree_sequence.py::TestSlimTime -k nonWF_recipe

for one of them. Annoyingly, the way we decided to do it means that we call the test function for every script, but simply do nothing if the script doesn't need this test running on it. I suspect that means the total count of tests run will be over-inflated. Perhaps there's a pytest way to decide, programmatically within the test, to say "skip this test and don't count it in the totals" (pytest-dev/pytest#3730 seems relevant)?
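
(For reference, the obvious in-test mechanism is pytest.skip(), roughly as sketched below; the basic_slim_recipe keys used here are illustrative, not the actual fixture contents. A skipped test still shows up in the run totals as "skipped", which is exactly the problem.)

```
import pytest

def test_needs_pedigree(basic_slim_recipe):
    # Bail out early for recipes that don't produce a pedigree file; pytest
    # reports the test as skipped rather than omitting it from the totals.
    if "pedigree" not in basic_slim_recipe:
        pytest.skip("recipe does not produce a pedigree file")
    pedigree = basic_slim_recipe["pedigree"]
    # ... assertions on the pedigree would go here ...
```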

The general setup is now mostly done in conftest.py, and is similar to what I did in SLiM. The slim_outfiles() function defines the file names, SLiM variable names, and a function to apply if the file is produced. Instead of the tree files having the name of the recipe, by default they are all called out.trees, but they are stored in a directory named by the recipe instead. I pass two SLiM variable names in, one for the trees file and another for the pedigree, but the second isn't always used by the script (which is fine, I would have thought).
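
To make the shape of that concrete, slim_outfiles() provides something along these lines (a minimal sketch only, not the actual conftest.py: the namedtuple, the pedigree file name, and the loader functions are illustrative), which matches how the outputs get consumed later, i.e. ret = {o.key: o.post_process(o.path) for o in outfiles if os.path.isfile(o.path)}:

```
import collections
import os

import pyslim  # assuming pyslim.load() is what reads the .trees output back in

def read_pedigree_file(path):
    # Illustrative stand-in: read the whitespace-separated pedigree table into rows.
    with open(path) as f:
        return [line.split() for line in f]

# Each output file records the dict key to store the result under, the SLiM variable
# that names the file, the path inside the per-recipe directory, and a post-processing
# function applied only if the file was actually produced.
Outfile = collections.namedtuple("Outfile", ["key", "slim_name", "path", "post_process"])

def slim_outfiles(out_dir):
    return [
        Outfile("ts", "TREES_FILE", os.path.join(out_dir, "out.trees"), pyslim.load),
        # "out.pedigree" is a made-up file name, for illustration only
        Outfile("pedigree", "PEDIGREE_FILE", os.path.join(out_dir, "out.pedigree"),
                read_pedigree_file),
    ]
```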

There's a FIXME because I'm not sure where in the code we should normally set the random seed if these tests are all parallelized.

The only fixture so far is basic_slim_recipe. I intend to have another one for restarted_slim_recipe, but I haven't yet worked out how I'll do the restarting thing, as it will rely on the previous recipe having been run.

Also, I have had to change the names of all the recipes to put "recipe" as a suffix, rather than a prefix (i.e. nonWF_recipe rather than, as previously, recipe_nonWF), because the -k flag is always a partial matching thing, so that if you do -k recipe_nonWF it will run both recipe_nonWF and recipe_nonWF_early. I hope the name change is OK.

@hyanwong force-pushed the refactor-tests branch 2 times, most recently from 0585a30 to 8b9bcc1 on January 25, 2021 at 14:40
@hyanwong (Member, Author)

Perhaps there's a pytest way to decide, programmatically within the test, to say "skip this test and don't count it in the totals" (pytest-dev/pytest#3730 seems relevant)?

Actually, I've just found a much nicer way to do all this, using @pytest.mark.parametrize to restrict which recipes we run the tests on:

class TestHasIndividualParents(tests.PyslimTestCase):

    pedigree_recipes = [k for k, v in basic_slim_recipe_specs.items() if "pedigree" in v]

    @pytest.mark.parametrize('basic_slim_recipe', pedigree_recipes, indirect=True)
    def test_pedigree_has_parents(self, basic_slim_recipe):
        # this will be called only for recipes listed with {'pedigree': True}
        ts = basic_slim_recipe["ts"]
        # ts is a ts with an associated pedigree file

@@ -0,0 +1,229 @@
generation stage individual age parent1 parent2
Contributor

let's not commit these temporary files produced by tests!

Member Author

Oops. Well spotted. Must have been left from a previous trial. Sorry!

tests/README.md Outdated
@@ -0,0 +1,25 @@
Install the dependencies using

`pip install -r requirements/CI/requirements.txt`
Contributor

if you're saying python3 below don't you need to also say pip3?

Member Author

Ah, possibly. No harm in that. Will change.

tests/README.md Outdated
To restrict to specific recipes in the examples dir, the syntax is:

```
python3 -m pytest -k nonWF_recipe
```
Contributor

very nice. But, what is nonWF_recipe? There aren't any example slim scripts named this?

Member Author

I guess it's obvious that these can be combined?

python3 -m pytest tests/test_tree_sequence.py::test_slim_generation -k nonWF_recipe

Contributor

Ah, ok. But, that's recipe_nonWF, not nonWF_recipe.

Contributor

oh wait maybe you renamed all the recipes?

Contributor

What's the motivation for that? I thought you were removing the repetitive recipe_ at the start... but it's at the end, now?

Member Author

The -k flag matches its arg anywhere in the test string. Since there are recipes named recipe_nonWF.slim and recipe_nonWF_early.slim, running pytest -k recipe_nonWF would run both. Putting _recipe on the end serves as a useful terminator to stop this.

Member Author

I guess the other way would be to require the .slim suffix to be passed: pytest -k recipe_nonWF.slim, and then I would have to cut off the .slim before checking for the settings in slim_recipe_specs.py. Perhaps you prefer this? It's a bit more fiddly though.

Member Author

Hmm, I forgot that the prefix isn't unique either. So we have WF_early_recipe and nonWF_early_recipe, both of which will now match with -k WF_early_recipe. Argh. Looks like we need recipe_ back on the front and .slim on the end too.

@petrelharp (Contributor) Jan 25, 2021

arg, that's fiddly. But it makes it a bit more clear to have the full file name, IMO.

# These SLiM scripts read in an existing trees file; the "input" gives a .trees file produced
# by recipes above that is appropriate for starting this script.
restarted_slim_recipe_specs = {
    "nucleotides_restart": {"WF": True, "nucleotides": True, "no_op": True, "input": f"tests/examples/recipe_nucleotides.trees"},
Contributor

this input looks different than the others

Member Author

Could be. I haven't tested this yet. I just want to get some feedback as to whether the general approach was OK.

os.makedirs(out_dir, exist_ok=True)
outfiles = run_slim(recipe=recipe_name, out_dir=out_dir)

ret = {o.key: o.post_process(o.path) for o in outfiles if os.path.isfile(o.path)}
Contributor

very slick!

@petrelharp (Contributor)

Gee, this is great! Looks good. (And, sorry I got confused about the file names before I read your whole comment. :sheepish:)

There's a FIXME because I'm not sure where in the code we should normally set the random seed if these tests are all parallelized.

I'm not sure either.

No other concerns! I do wonder if passing in the TREES_FILE and PEDIGREE_FILE is unnecessary complexity, since we could just pass in OUTDIR instead, but no big deal.

@hyanwong (Member, Author)

Gee, this is great! Looks good. (And, sorry I got confused about the file names before I read your whole comment. :sheepish:)

No probs. I'll try to get the whole thing working then.

There's a FIXME because I'm not sure where in the code we should normally set the random seed if these tests are all parallelized.

I'm not sure either.

I'll leave it as a FIXME for the time being then, and maybe ask Ben.

No other concerns! I do wonder if passing in the TREES_FILE and PEDIGREE_FILE is unnecessary complexity, since we could just pass in OUTDIR instead, but no big deal.

If we explicitly pass the full file names, we only define them in one place. Otherwise we need to decide on a filename (e.g. out.trees) in SLiM, and redefine the same names in the Python code too (and make sure they stay in sync). We would also need to do the filename munging in SLiM, and the code to create the dir+filename in SLiM would be repeated in all the recipes, so I suspect it would result in more code duplication.
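
Roughly, the idea is for the Python side to own the paths and hand them to SLiM, something like the sketch below (assuming run_slim() shells out to the slim binary and passes the paths via SLiM's -d flag, and reusing the hypothetical slim_outfiles() sketched above; this isn't the PR's exact code):

```
import subprocess

def run_slim(recipe, out_dir):
    # One -d definition per output file, e.g. -d "TREES_FILE='/tmp/.../out.trees'",
    # so the full paths are defined in exactly one place (here, in Python).
    outfiles = slim_outfiles(out_dir)
    defines = []
    for o in outfiles:
        defines += ["-d", f"{o.slim_name}='{o.path}'"]
    subprocess.run(["slim", *defines, recipe], check=True)
    return outfiles
```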

@petrelharp (Contributor)

Oh - about the seed - I think I do know the answer. The test set-up doesn't do any random stuff, so it doesn't need a seed set. And all the randomness we do in tests should have the seed specified, like

def test_whatever():
    random.seed(1234)
    a = random.sample(x, k)  # x and k as appropriate for the test
    ...

... as we do in tskit. So no need to put it there. (It's possible that some of my tests don't set the seed, but they should.)

@hyanwong force-pushed the refactor-tests branch 4 times, most recently from ed88aca to cf8d4db on January 26, 2021 at 01:24
@hyanwong marked this pull request as ready for review on January 26, 2021 at 01:24
@hyanwong (Member, Author)

Hoorah! The tests all seem to be passing now. You can fully parallelise all the tests & recipes, the SLiM simulation output is all saved to tmp dirs (and each simulation is only run once, even when multi-threaded, I hope), and there's a slick way to call the restarted tests while making sure that the sims on which they depend have been run:

    @pytest.mark.parametrize(
        'restart_name, basic_recipe', restarted_recipe_eq("no_op"), indirect=["basic_recipe"])

Then use basic_recipe["ts"], which you are now sure is available, and run SLiM on the output. The fancy stuff is all done in restarted_recipe_eq() at the bottom of recipe_specs.py. I can explain how it works if it's not clear.
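
For anyone reading along, restarted_recipe_eq() does something along these lines (a rough sketch inferred from the usage above; the spec dictionary and its key names are made up, not the real recipe_specs.py):

```
# A made-up spec dictionary, standing in for the real recipe_specs.py contents.
restarted_recipe_specs = {
    "restart_nucleotides.slim": {"no_op": True, "basic_recipe": "recipe_nucleotides.slim"},
}

def restarted_recipe_eq(*flags):
    # Yield (restart_name, basic_recipe_name) pairs for every restart recipe whose spec
    # has all the requested flags set. Because "basic_recipe" is an indirect parameter,
    # pytest routes the second element through the basic_recipe fixture, which ensures
    # the underlying simulation has been run before the restart test uses its output.
    for restart_name, spec in restarted_recipe_specs.items():
        if all(spec.get(flag) for flag in flags):
            yield restart_name, spec["basic_recipe"]
```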

Weirdly the restart_and_run_WF.slim and restart_and_run_nonWF.slim seem not to have slim recipes in tests/examples, so I've commented them out for the time being.

The pytest command will also now work from whatever dir you want to run it in.

Ready for final review, I reckon @petrelharp. Then I can try to move on to the "retain"ing PR.

@hyanwong (Member, Author)

P.S. there's not much left in __init__.py - it's mostly been moved to conftest.py. In fact, some of the stuff in __init__.py, like assertTablesEqual, can probably be swapped for functions that are now built into tskit (I think?)

@codecov-io commented Jan 26, 2021

Codecov Report

Merging #131 (b52d1c4) into main (04ab7ea) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##             main     #131   +/-   ##
=======================================
  Coverage   89.39%   89.39%           
=======================================
  Files           5        5           
  Lines         886      886           
  Branches      166      166           
=======================================
  Hits          792      792           
  Misses         64       64           
  Partials       30       30           



def test_load_without_provenance(self):
@pytest.mark.parametrize(
'restart_name, basic_recipe', restarted_recipe_eq("no_op"), indirect=["basic_recipe"])
Contributor

So, what's indirect do? Makes it cached? The docs are kinda cryptic...

Member Author

I think it feeds the parameter value through the fixture function. So passing ("restart_WF.slim", "recipe_WF.slim") to "restart_name, basic_recipe", ..., indirect=["basic_recipe"] will call the basic_recipe fixture with "recipe_WF.slim" and return its result as the basic_recipe value in the test, whereas the restart_name parameter in the test function just contains the plain value "restart_WF.slim".
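
Schematically, the mechanism looks like this (an illustration of the pytest behaviour only, not the actual conftest.py fixture; run_recipe() is a made-up stand-in):

```
import pytest

def run_recipe(recipe_name):
    # Made-up stand-in for running SLiM and loading its output.
    return {"name": recipe_name}

@pytest.fixture
def basic_recipe(request):
    # With indirect parametrization, the parametrized value arrives as request.param,
    # and the test receives whatever this fixture returns.
    return run_recipe(request.param)  # e.g. request.param == "recipe_WF.slim"

@pytest.mark.parametrize(
    'restart_name, basic_recipe',
    [("restart_WF.slim", "recipe_WF.slim")],
    indirect=["basic_recipe"])
def test_restart(restart_name, basic_recipe):
    # restart_name is the plain string; basic_recipe is the fixture's return value.
    assert restart_name == "restart_WF.slim"
    assert basic_recipe["name"] == "recipe_WF.slim"
```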

Contributor

oh!! gotcha.

tests/test_metadata.py Outdated (resolved)
@petrelharp (Contributor)

Wow, that was a lot of work! And, it works!! Thanks for the wonderful solutions, not to mention the cleaning up after me in a few places. I have been over this with a fairly coarse comb, and have muddled through these fixture things enough to convince myself it's doing all the stuff it did before. (although I may need you to explain how it all works sometime soon, while it's fresh...) I only found two very minor changes in tests/examples/recipe_nonWF.slim, and one multiply-defined test; otherwise I'm happy to merge.

(And, I've done a diff of the tests run before and after this PR, and other than that multiply-defined test, they're all being run!)

@hyanwong (Member, Author)

Wow, that was a lot of work! And, it works!!

🎉

I may need you to explain how it all works sometime soon, while it's fresh...

Yeah, it's not immediately obvious, sadly. But I think that's OK, the intention is clear to a reader.

I only found two very minor changes in tests/examples/recipe_nonWF.slim, and one multiply-defined test; otherwise I'm happy to merge.

Done. Merge away!

(And, I've done a diff of the the tests run before and after this PR, and other than that multiply-defined test, they're all being run!)

Great. How do you do that, by the way?

@petrelharp (Contributor)

Yay!

I did

python3 -m pytest tests -n 1 -v &> temp_old
git checkout refactor-tests
python3 -m pytest tests -n 1 -v &> temp_new
diff <(cat temp_old | grep "^tests" | sort | uniq) <(cat temp_new | grep "^tests" | sed -e 's/\[.*\]//' | sort | uniq)

It's not checking whether the tests are being run on exactly the same scripts, but all the tests are being run, and none are being run on the wrong scripts, so I think it's all good.

@petrelharp merged commit 4fa846b into tskit-dev:main on Jan 26, 2021