Pipeline crashed when no variant is found #14

mfoll · 2015-09-11T12:23:34Z

It seems that when the process R_regression finds no variants, it produces no pdf output. The errorStrategy 'ignore' tolerates the missing pdf, but as a result the empty vcf is not sent to the vcf channel:

output:
     file "${region_tag}.vcf" into vcf
     file '*.pdf' into PDF

This creates an error in the collect_vcf_result process as there is no vcf there.

tdelhomme · 2015-09-16T14:32:38Z

Do you have any example file to test this?

mfoll · 2015-09-16T15:09:06Z

If you use the tiny bed file on this test data set it will crash:

git clone --depth=1 https://github.com/mfoll/NGS_data_test.git
cd NGS_data_test/1000G_CEU_TP53/
nextflow run mfoll/robust-regression-caller -with-docker mfoll/robust-regression-caller --bed \
               TP53_tiny.bed --bam_folder BAM/ --fasta_ref 17.fasta.gz

We can either:

1. Add errorStrategy 'ignore' in collect_vcf_result. But then the pipeline will output nothing when no variant is found, and I don't much like this option because it makes a real error harder to spot.
2. Check whether a vcf file was produced and, if not, output an empty vcf.
3. Ask if the nextflow people could change this behaviour (when one output is missing, still send the others to their channels). This would be my favorite option of course.
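
For illustration, the "output an empty vcf" option could look roughly like the sketch below. It is not taken from the pipeline: the function name write_empty_vcf_if_none, the output name all_variants.vcf and the minimal two-line header are assumptions, and a real collect step would probably want to carry over the full header of the caller.

```shell
#!/usr/bin/env bash
# Hypothetical helper: if a directory contains no *.vcf file,
# write a header-only (variant-free) VCF to the given output path.
write_empty_vcf_if_none() {
    local dir="$1" out="$2"
    # compgen -G succeeds only when the glob matches at least one path
    if ! compgen -G "$dir/*.vcf" > /dev/null; then
        {
            printf '##fileformat=VCFv4.1\n'
            printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
        } > "$out"
    fi
}

# Demo in a throwaway directory that holds no vcf at all
workdir="$(mktemp -d)"
write_empty_vcf_if_none "$workdir" "$workdir/all_variants.vcf"
cat "$workdir/all_variants.vcf"
```

The check is done on the directory contents rather than on an exit code, so a genuine failure of the upstream step would still surface as an error instead of being silently ignored.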

mfoll · 2015-09-16T15:44:51Z

@pditommaso suggestion:

the easiest way to handle this is creating an empty pdf file in the BASH script
and eventually filtering out the empty file from pdf channel
something like this
PDF.filter { it.size() > 0 }.set { pdf_2 }
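
The first half of that suggestion (creating the empty pdf from the BASH side) might be sketched as follows. The function name add_pdf_placeholder and the file name empty.pdf are hypothetical, and the sketch assumes the placeholder is written unconditionally at the end of the process script, so that a '*.pdf' output glob always matches at least one file.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: always leave a zero-byte placeholder pdf in the
# working directory so that the '*.pdf' output pattern matches something.
add_pdf_placeholder() {
    local dir="$1"
    : > "$dir/empty.pdf"   # placeholder name is an assumption
}

# Demo in a throwaway directory
workdir="$(mktemp -d)"
add_pdf_placeholder "$workdir"
if [ -s "$workdir/empty.pdf" ]; then
    echo "placeholder has content"
else
    echo "placeholder is empty"
fi
```

Because the placeholder has size zero, it can be told apart from real plots later by a size check such as the filter shown above.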

mfoll · 2015-09-16T15:46:38Z

That's complicating the pipeline just to produce an empty vcf output... I would prefer option 2 I mentioned above.

pditommaso · 2015-09-16T16:01:19Z

As far as I understand, this happens only with the test data. Is it not possible to create a test dataset producing at least one entry in the vcf?

mfoll · 2015-09-16T16:05:25Z

No, it happened to me on real data and then I created a test replicating the issue.

pditommaso · 2015-09-16T16:09:57Z

So the pileup_nbrr_caller_vcf.r script creates an empty vcf, and in this case it does not create the pdf file.

In my opinion for consistency it should create an empty pdf file (or none of them).

Creates an empty pdf in all cases and delete them afterward

Solves #14

mfoll · 2015-09-17T15:19:26Z

I am trying to delete the empty pdf files. But the empty pdf is only deleted when it is the only pdf produced. There must be an error in this line: https://github.com/mfoll/robust-regression-caller/blob/dev/samtools_regression_somatic_vcf.nf#L149
@pditommaso can you help please?

pditommaso · 2015-09-17T15:24:59Z

I'm a bit confused about that code. Why do you need a PDF output channel if it is not consumed by anybody?

mfoll · 2015-09-17T15:29:16Z

To copy it where I want it to be using storeDir, but maybe there is another way? And actually, if I remember correctly, omitting into PDF leads to an infinite loop in nextflow without any error message (but I need to double check that).

mfoll · 2015-09-17T15:41:25Z

Ok I was wrong, the into PDF is not necessary and not leading to an infinite loop.
But I do need it to delete the empty pdf file that I produce. So do you know what is wrong with PDF.filter { it.size() == 0 }.subscribe { it.delete() }?

pditommaso · 2015-09-17T15:42:07Z

Does it report an error message? what's the problem?

mfoll · 2015-09-17T15:43:50Z

No, but when the process R_regression produces a single pdf output (the empty one I create here) it is properly deleted, but when the process R_regression also creates others, it's not deleted.

mfoll · 2015-09-17T15:50:45Z

If you want to replicate the issue you can use my test data:

git clone --depth=1 https://github.com/mfoll/NGS_data_test.git
cd NGS_data_test/1000G_CEU_TP53/

Then:

nextflow run mfoll/robust-regression-caller -with-docker mfoll/robust-regression-caller --bed TP53_tiny.bed --bam_folder BAM/ --fasta_ref 17.fasta.gz

Creates the empty pdf and deletes it (absent from BAM/VCF).

nextflow run mfoll/robust-regression-caller -with-docker mfoll/robust-regression-caller --bed TP53_exon2_11.bed --bam_folder BAM/ --fasta_ref 17.fasta.gz

Creates the empty pdf and others, and then doesn't delete the empty one (present in BAM/VCF).

pditommaso · 2015-09-17T15:52:06Z

OK, I think the problem is that when there is more than one pdf, the PDF items are list objects, so it.size() returns the size of the list instead of the size of the file. You should declare it as flatten.

That said, I've noticed that you use storeDir on every process. Are you aware that it will cause all following runs of the pipeline to skip the execution of those processes?
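
The point in the first paragraph of this comment is that the emptiness check has to be applied to each file individually, not to the group of files emitted together. As an analogy only (this is plain shell, not the Nextflow fix itself, and the names remove_empty_pdfs, empty.pdf and region1.pdf are made up for the demo), a per-file check looks like this:

```shell
#!/usr/bin/env bash
# Hypothetical helper: delete zero-byte pdf files one at a time,
# however many pdfs the directory contains.
remove_empty_pdfs() {
    local dir="$1"
    local f
    for f in "$dir"/*.pdf; do
        [ -e "$f" ] || continue      # the glob matched nothing
        if [ ! -s "$f" ]; then       # -s is true only for files larger than 0 bytes
            rm -- "$f"
        fi
    done
}

# Demo: one empty placeholder next to one non-empty stand-in for a real plot
workdir="$(mktemp -d)"
: > "$workdir/empty.pdf"
printf '%%PDF-1.4\n' > "$workdir/region1.pdf"
remove_empty_pdfs "$workdir"
ls "$workdir"
```

Testing the size of the whole group instead would be like asking how many pdfs there are, which is never zero once the placeholder exists alongside real plots, so nothing would be deleted.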

Resolves #14

mfoll · 2015-09-18T09:57:32Z

Sorry @pditommaso I should have figured this out by myself (you should stop being so helpful, or users like me will become lazy!). It's working perfectly now.

Yes I know the behaviour with storeDir and I actually like it. It's true I could remove some in the production version of the pipeline, but it's really helpful at the moment to keep all intermediate files outside the workfolder in an organised way. It's also very convenient for restarting the pipeline at a given point: I just need to delete the folders created after this point and nextflow automatically skips the previous steps. Maybe something to add in nextflow would be an option in the command line to ignore and overwrite the existing files if they exist in this case (like the opposite to the resume option).

mfoll self-assigned this Sep 11, 2015

mfoll added help wanted bug labels Sep 11, 2015

mfoll added this to the First official release v0.1 milestone Sep 16, 2015

mfoll added a commit that referenced this issue Sep 17, 2015

Solves issue #14

246534a

Creates an empty pdf in all cases and delete them afterward

mfoll referenced this issue Sep 17, 2015

Update samtools_regression_somatic_vcf.nf

49e9fa0

mfoll mentioned this issue Sep 17, 2015

Resolves #14 #17

Merged

tdelhomme added a commit that referenced this issue Sep 17, 2015

Merge pull request #17 from mfoll/dev

73c1141

Solves #14

mfoll closed this as completed Sep 17, 2015

mfoll reopened this Sep 17, 2015

mfoll added a commit that referenced this issue Sep 18, 2015

Resolves #14

8c761b3

mfoll mentioned this issue Sep 18, 2015

Resolves #14 #19

Merged

mfoll closed this as completed in #19 Sep 18, 2015

mfoll added a commit that referenced this issue Sep 18, 2015

Merge pull request #19 from mfoll/dev

f2b1e11

Resolves #14

mfoll added the minor bug label Sep 18, 2015

mfoll removed the bug label Apr 25, 2016
