Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds possibility to make drop database #147

Draft
wants to merge 15 commits into
base: dev
Choose a base branch
from
Draft

Adds possibility to make drop database #147

wants to merge 15 commits into from

Conversation

Lucpen
Copy link
Collaborator

@Lucpen Lucpen commented Jul 30, 2024

PR checklist

  • Adds possibility to make drop database
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • Make sure your code lints (nf-core lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

Copy link

github-actions bot commented Jul 30, 2024

nf-core lint overall result: Passed ✅ ⚠️

Posted for pipeline commit f3d2d0c

+| ✅ 167 tests passed       |+
#| ❔ 317 tests were ignored |#
!| ❗   9 tests had warnings |!

❗ Test warnings:

  • files_exist - File not found: .github/workflows/awstest.yml
  • files_exist - File not found: .github/workflows/awsfulltest.yml
  • nextflow_config - Config manifest.version should end in dev: 2.2.1
  • pipeline_todos - TODO string in ci.yml: You can customise CI pipeline run tests as required
  • pipeline_todos - TODO string in base.config: Check the defaults for all processes
  • pipeline_todos - TODO string in base.config: Customise requirements for specific processes.
  • pipeline_todos - TODO string in test_full.config: Specify the paths to your full test data ( on nf-core/test-datasets or directly in repositories, e.g. SRA)
  • pipeline_todos - TODO string in test_full.config: Give any required params for the test so that command line flags are not needed
  • pipeline_todos - TODO string in methods_description_template.yml: #Update the HTML below to your preferred methods description, e.g. add publication citation for this pipeline

❔ Tests ignored:

✅ Tests passed:

Run details

  • nf-core/tools version 2.14.1
  • Run at 2024-09-17 05:56:48

@Lucpen
Copy link
Collaborator Author

Lucpen commented Jul 30, 2024

I haven't modified the CHANGELOG yet to avoid conflicts with the other this open PR

@Jakob37
Copy link

Jakob37 commented Aug 19, 2024

I'll give this a go now! Will let you know how it goes.

@Jakob37
Copy link

Jakob37 commented Aug 20, 2024

A first question. Should it be possible to generate the db without assigning the three DROP annotation parameters? Seems to me that it skipped DROP entirely when running without them.

@Lucpen
Copy link
Collaborator Author

Lucpen commented Aug 20, 2024

It should work, those options are now optional :)

@Jakob37
Copy link

Jakob37 commented Aug 22, 2024

Looks like it is passing the DROP config building step now, nice! We will see whether it passes the AE and AS steps as well, will let you know.

@Lucpen
Copy link
Collaborator Author

Lucpen commented Aug 22, 2024

Great 🤞

@Jakob37
Copy link

Jakob37 commented Aug 22, 2024

Some updates. It initially crashed. I recognize the error from my previous testing (gagneurlab/drop#568).

When debugging before, the error came from the columns with NA. These are translated to nan in the DROP R scripts, and crashing in different places.

Also, I think it needs the COUNT_MODE and COUNT_OVERLAPS to run the counting step.

After doing the following updates, it is running for me:

  • Removing GENE_COUNTS_FILE from the output (i.e. the column with NA values)
  • Setting COUNT_MODE to IntersectionStrict
  • Setting COUNT_OVERLAPS to True

After these changes it has been running fine for 15+ minutes. We'll see how it goes.

@Lucpen
Copy link
Collaborator Author

Lucpen commented Aug 22, 2024

Nice! Did you just remove the GENE_COUNTS_FILE column or left it empty?

@Jakob37
Copy link

Jakob37 commented Aug 22, 2024

Nice! Did you just remove the GENE_COUNTS_FILE column or left it empty?

I totally removed it. Haven't tested what happens if leaving it empty (i.e. without the 'NA' strings).

@Jakob37
Copy link

Jakob37 commented Aug 22, 2024

The aberrant expression run was successful! I now see the geneCounts.tsv.gz in the processed_results/exported_counts dir.

I'll try the splicing module next.

Caching did not seem to work for the AE process though - it restarted when I did -resume. Haven't investigated why though, disabled it temporarily to try the splicing now.

@Lucpen
Copy link
Collaborator Author

Lucpen commented Aug 22, 2024

Happy to hear that the AE module worked 😄
The caching does not work because the order in which the samples are provided to the annotation module varies, I will try to think of a way to prevent that form happening.

@Jakob37
Copy link

Jakob37 commented Aug 23, 2024

For the splicing, I am running into the same DROP error as I received previously. It seems there is a bug triggered when running with no external counts. Seems to be a silly issue, due to symlink and create a folder in the same location.

See issue: gagneurlab/drop#558

Have you managed to get around this somehow? 🤔

@Lucpen
Copy link
Collaborator Author

Lucpen commented Aug 23, 2024

I have run without external counts but have not encountered this issue 🤔
Probably a stupid question, but does it happen consistently if you restart the snakemake run?

@Jakob37
Copy link

Jakob37 commented Aug 23, 2024

I have run without external counts but have not encountered this issue 🤔 Probably a stupid question, but does it happen consistently if you restart the snakemake run?

Hmm, strange. Was it a different version of DROP?

Yes, I still get it on reruns!

@Lucpen
Copy link
Collaborator Author

Lucpen commented Aug 23, 2024

No, same version

@Jakob37
Copy link

Jakob37 commented Aug 28, 2024

I think I figured out the issue (see notes in gagneurlab/drop#558).

Seems the splicing workflow cannot handle a relative "root" path as it will yield an invalid softlink in one step.

I am trying now a run where I have asigned the path in drop_config.py as such:

config_copy["root"] = str(Path.cwd() / "output")

Before it was:

config_copy["root"] = "output"

@Lucpen
Copy link
Collaborator Author

Lucpen commented Aug 28, 2024

Great! Let me know how it goes 😄

@Jakob37
Copy link

Jakob37 commented Aug 29, 2024

Now it passed that step! Still crashing though :(

In the step Error in rule AberrantSplicing_pipeline_FRASER_08_extract_results_FraseR_R:

The debugging continues ...

@Jakob37
Copy link

Jakob37 commented Aug 30, 2024

After updating the DROP container to version 1.4.0 the run went through 🎉

@Lucpen
Copy link
Collaborator Author

Lucpen commented Aug 30, 2024

Great! Thanks for all the help with the debugging!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants