Allow running DROP without reference samples #152

Jakob37 · 2024-08-14T11:45:41Z

Description of feature

Right now it looks like drop_sample_annot.py expects a reference.

In this case, I have a large run with 100 samples, which would serve as their own reference.

This does not seem to be supported currently. If no external reference is supplied, DROP will not run.

I am working around this by supplying an empty reference and changing drop_sample_annot.py so that it does not crash when no data is present in the df. It would be helpful to have an "official" way to do this.
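A minimal sketch of the kind of guard described above, not the actual change made to drop_sample_annot.py. It assumes the annotations are tab-separated files; the function names `read_annotation` and `combine_annotations` and the example column `RNA_ID` are hypothetical, used only for illustration.

```python
import csv
import io


def read_annotation(handle):
    """Read a tab-separated annotation and return (fieldnames, rows).

    An empty file, or a file containing only a header, yields an empty
    list of rows instead of raising an error.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    rows = list(reader)
    return (reader.fieldnames or []), rows


def combine_annotations(sample_rows, reference_rows):
    """Append reference rows to sample rows, tolerating an empty reference."""
    if not reference_rows:
        # No external reference supplied: the cohort serves as its own reference.
        return list(sample_rows)
    return list(sample_rows) + list(reference_rows)


# Hypothetical usage: an empty reference file does not break the merge.
_, reference = read_annotation(io.StringIO(""))
samples = [{"RNA_ID": "S1"}, {"RNA_ID": "S2"}]
combined = combine_annotations(samples, reference)
```

The point of the design is that the empty-reference case is handled explicitly in one place, rather than letting later steps fail on an empty table.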

What do you think?

Sorry about the issue bombardment today 🫣

Lucpen · 2024-08-14T12:24:03Z

No worries, we are happy to get issues and improve the pipeline 😄
At the moment Tomte is designed to run only a few samples against an already existing database, not to create one. However, we have been working on modifying the code so that an actual database can be created. Here is the PR. We still need to test it thoroughly, but if you want to give it a try, feel free to run it and to make any suggestions on how to improve it.

Jakob37 · 2024-08-14T13:22:56Z

OK, great! I'll give it a go. That sounds exactly like what we will need ahead.

Managed to get the DROP run started anyway without any reference db. We will see how that goes ...

Lucpen · 2024-08-14T13:39:06Z

Please let me know if it works. If it doesn't, it will be better to restart DROP outside of the pipeline, as explained here

Jakob37 · 2024-08-15T05:23:10Z

Thanks for the tips! That will be very helpful.

It made it pretty far (edit: not super far, a little bit), into the Counting_Summary step. Will see if I can figure that out today 🤔

Jakob37 · 2024-08-16T05:48:57Z

The aberrant expression run went through 🎉 I needed to remove the following columns from the produced sample_annot.tsv file: GENE_COUNTS_FILE, SEX. Otherwise both were produced filled with NA values, which DROP could not handle downstream. It seems its R parsing 'cleverly' translates the string "NA" to nan.
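The column removal could be scripted along these lines. This is a sketch under assumptions, not part of Tomte or DROP: the helper name `drop_all_na_columns` is hypothetical, the example rows are made up, and a column is dropped only when every row holds "NA" or an empty string.

```python
import csv
import io

# Values treated as missing in this sketch (an assumption).
NA_VALUES = {"", "NA"}


def drop_all_na_columns(fieldnames, rows, candidates=("GENE_COUNTS_FILE", "SEX")):
    """Remove candidate columns whose value is missing in every row."""
    dropped = [
        col
        for col in candidates
        if col in fieldnames
        and all((row.get(col) or "").strip() in NA_VALUES for row in rows)
    ]
    kept = [col for col in fieldnames if col not in dropped]
    cleaned = [{col: row.get(col, "") for col in kept} for row in rows]
    return kept, cleaned


# Hypothetical sample annotation in which both columns are filled with NA.
text = "RNA_ID\tGENE_COUNTS_FILE\tSEX\nS1\tNA\tNA\nS2\tNA\tNA\n"
reader = csv.DictReader(io.StringIO(text), delimiter="\t")
rows = list(reader)
kept, cleaned = drop_all_na_columns(reader.fieldnames, rows)

# Write the cleaned table back out as tab-separated text.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=kept, delimiter="\t", lineterminator="\n")
writer.writeheader()
writer.writerows(cleaned)
```

Columns that contain at least one real value are left untouched, so the same script can be applied to annotations that do carry sex or external count information.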

I raised an issue in DROP about it: gagneurlab/drop#568

Still running the splicing run. It filled our RAM when running with 64 threads, but seems to be doing fine on a smaller number of threads (12). Might require some further fiddling with the sample_annot.tsv file to get it through downstream steps I guess, we will see.

Jakob37 · 2024-08-16T05:53:31Z

Have you guys btw considered running OUTRIDER and FRASER2 outside the DROP pipeline? It seems the handful of steps could be lifted over from Snakemake to one or two Nextflow subworkflows. This would make things much cleaner for debugging, resuming and caching, with fewer dependencies on DROP.

I realize this would mean considerable extra work to set up, and it might not be feasible. Just a thought!

Jakob37 · 2024-08-16T11:22:52Z

The FRASER2 pipeline ran pretty far, but crashed in one of the final FRASER calculation steps. It seems to be a bug that only appears when not using external counts: gagneurlab/drop#558

Jakob37 added the enhancement label Aug 14, 2024

Lucpen linked a pull request Aug 14, 2024 that will close this issue

Adds possibility to make drop database #147 (Draft, 10 tasks)

Jakob37 closed this as completed Aug 14, 2024

This issue was closed.
