In the absence of a bed file the pipeline should run on the full reference genome #39

mfoll · 2015-10-06T07:47:51Z

bedtools has a makewindows function that can create a bed file from a fasta index file, with windows of a given size. For example, to split the whole genome into 10 Mb regions:

bedtools makewindows -g reference.fasta.fai -w 10000000
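To make the expected output concrete, here is a rough sketch that emulates fixed-size windows with awk on a toy index, so it runs without bedtools installed. The contig names and lengths and the make_windows helper are made up for illustration, and the awk loop is only a stand-in for what makewindows produces (0-based starts, last window capped at the contig length), not how bedtools implements it.

```shell
# Toy fasta index (hypothetical contigs): name, length, then offset/line-width fields.
fai=$(mktemp)
printf 'chr1\t25\t6\t60\t61\nchr2\t10\t38\t60\t61\n' > "$fai"

# Hypothetical helper: emit windows of width w for every contig in a .fai file.
# Starts are 0-based and the end of the last window is capped at the contig length.
make_windows() {
  awk -v w="$2" 'BEGIN { OFS = "\t" }
    { for (s = 0; s < $2; s += w) { e = s + w; if (e > $2) e = $2; print $1, s, e } }' "$1"
}

windows=$(make_windows "$fai" 10)
printf '%s\n' "$windows"
rm -f "$fai"
```

With a window of 10 the 25-base contig yields three regions, the last one shorter than the others, and the 10-base contig yields one.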

I could also use this function to split the bed file, instead of or in combination with my own R script.

mfoll · 2015-10-06T09:43:30Z

We could use conditional processes like here

mfoll · 2015-10-08T07:28:14Z

The window size should be calculated automatically from nsplit and the total size of the genome. The total is easy to get from the fai file, for example with Rscript in bash, here to cut the genome into 500 pieces:

windows_size=$(Rscript -e "cat(sum(as.numeric(read.table(\"reference.fasta.fai\")[,2]))/500,\"\n\")")
echo $windows_size
6191388

So in the Nextflow script it will appear as:

windows_size=$(Rscript -e "cat(sum(as.numeric(read.table(\"!{fasta_ref_fai}\")[,2]))/!{params.nsplit},\"\n\")")
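The same calculation could also be sketched without R, which may be handy if the window size has to be a whole number of bases before it is passed to bedtools makewindows -w. This is only an illustration under assumptions: the toy index and its contig lengths are invented, and nsplit here is a plain shell variable standing in for params.nsplit.

```shell
# Toy index with hypothetical contig lengths (column 2 of a .fai file).
fai=$(mktemp)
printf 'chr1\t1000\t6\t60\t61\nchr2\t501\t1030\t60\t61\n' > "$fai"
nsplit=4   # stand-in for params.nsplit

# Sum the contig lengths, divide by nsplit, and round up to a whole number of bases.
window_size=$(awk -v n="$nsplit" '{ total += $2 }
  END { w = int(total / n); if (w * n < total) w++; print w }' "$fai")
echo "$window_size"
rm -f "$fai"
```

Rounding up rather than down is a design choice made for this sketch, so that nsplit windows of this size always cover the total length. Because windows are cut per contig, the final number of regions can still be somewhat larger than nsplit.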

mfoll · 2015-12-11T15:32:32Z

See https://github.com/hall-lab/speedseq#annotations

mfoll added this to the v0.3 milestone Oct 6, 2015

mfoll added the enhancement label Oct 6, 2015

mfoll self-assigned this Oct 6, 2015

mfoll mentioned this issue Oct 8, 2015

Improve the bed split method #47

Closed

mfoll assigned tdelhomme and unassigned mfoll Oct 8, 2015

mfoll added the ready label Dec 17, 2015

tdelhomme mentioned this issue Jan 28, 2016

calling on region or WG possible #84

Merged

tdelhomme closed this as completed Jan 28, 2016

mfoll removed the ready label Jan 28, 2016
