Merge external splicing counts #247

c-mertes · 2021-08-12T22:47:31Z

This is a fresh take on merging external splicing counts. #169

vyepez88 · 2021-12-01T12:31:27Z

drop/modules/aberrant-splicing-pipeline/Counting/01_4_countRNA_nonSplitReads_merge.R

@@ -50,3 +52,5 @@ nonSplitCounts <- getNonSplitReadCountsForAllSamples(fds=fds,
                                                     longRead=params$longRead)

 message(date(), ":", dataset, " nonSplit counts done")
+
+file.create(snakemake@output$done)


Suggested change

file.create(snakemake@output$done)

file.create(snakemake@output$done)

nickhsmith · 2022-03-24T13:40:16Z

drop/modules/aberrant-splicing-pipeline/Counting/03_filter_expression_FraseR.R

+        minExpressionInOneSample = minExpressionInOneSample,
+        minDeltaPsi = minDeltaPsi,
+        filter=FALSE)
+fds <- saveFraserDataSet(fds)


If using external counts, save a new copy of the fds object;
otherwise, use a symlink to the existing fds object.

Use the new copy or link in all subsequent processing.
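
The copy-or-link idea in this comment could be sketched roughly as follows. This is only an illustration in base R: the helper name `link_or_copy_fds`, the directory layout, and the file name `fds-object.RDS` are assumptions made for the example, not code taken from DROP or FRASER.

```r
# Hypothetical helper: names and paths are illustrative, not taken from DROP.
link_or_copy_fds <- function(src_dir, dest_dir, has_external) {
  if (has_external) {
    # External counts modify the object, so keep an independent copy.
    dir.create(dest_dir, recursive = TRUE, showWarnings = FALSE)
    files <- list.files(src_dir, full.names = TRUE)
    all(file.copy(files, dest_dir, recursive = TRUE))
  } else {
    # Nothing changes, so a symlink avoids duplicating the data on disk.
    # dest_dir must not exist yet, otherwise the link is created inside it.
    file.symlink(normalizePath(src_dir), dest_dir)
  }
}

# Toy setup in a temporary directory (assumed layout, for demonstration only).
base <- tempfile("fds_demo_")
dir.create(base)
src <- file.path(base, "local")
dir.create(src)
file.create(file.path(src, "fds-object.RDS"))

copied <- link_or_copy_fds(src, file.path(base, "merged"), has_external = TRUE)
linked <- link_or_copy_fds(src, file.path(base, "link"), has_external = FALSE)
```

Either way, later steps can read from one destination path without needing to know whether external counts were merged. Note that `file.symlink` may not work on every platform or file system.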

nickhsmith · 2022-04-21T09:24:54Z

docs/source/output.rst

+Results and Output of DROP
+===========================
+


follow slack comments

nickhsmith · 2022-04-21T09:37:42Z

drop/modules/aberrant-expression-pipeline/Counting/filterCounts.R

+has_external <- !(all(ods@colData$GENE_COUNTS_FILE == "") || is.null(ods@colData$GENE_COUNTS_FILE))
+if(has_external){
+    ods@colData$isExternal <- as.factor(ods@colData$GENE_COUNTS_FILE != "")
+}else{


Add a comment to explain this check.
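
As a minimal sketch of what a commented version of the check above might look like, the snippet below uses a plain `data.frame` named `col_data` as a stand-in for the dataset's colData. Both the stand-in and the toy values are assumptions. The branch after `else` is not shown in the diff, so it is left out here.

```r
# Hypothetical stand-in for the colData of the ods object (toy values).
col_data <- data.frame(
  RNA_ID = c("s1", "s2", "s3"),
  GENE_COUNTS_FILE = c("", "external_counts.tsv", ""),
  stringsAsFactors = FALSE
)

counts_file <- col_data$GENE_COUNTS_FILE

# External counts are present only if the GENE_COUNTS_FILE column exists
# and at least one sample has a non-empty entry in it.
has_external <- !(is.null(counts_file) || all(counts_file == ""))

if (has_external) {
  # Flag each sample: TRUE if its counts come from an external file.
  col_data$isExternal <- as.factor(counts_file != "")
}
```

Testing `is.null()` first makes the intent explicit, although `all(NULL == "")` already evaluates to `TRUE` in R, so the original order gives the same result for a missing column.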

nickhsmith · 2022-04-21T10:11:53Z

drop/modules/mae-pipeline/MAE/Results.R

 res[MAE == TRUE & MAE_ALT == FALSE, N_MAE_REF := .N, by = ID]
 res[MAE_ALT == TRUE, N_MAE_ALT := .N, by = ID]
 res[MAE == TRUE & MAE_ALT == FALSE & rare == TRUE, N_MAE_REF_RARE := .N, by = ID]
 res[MAE_ALT == TRUE & rare == TRUE, N_MAE_ALT_RARE := .N, by = ID]

 rd <- unique(res[,.(ID, N, N_MAE, N_MAE_REF, N_MAE_ALT, N_MAE_REF_RARE, N_MAE_ALT_RARE)])
+
+# rd contains duplicate entries for each ID. IE when MAE==F N_MAE for ID1 is both .N and 0
+# summarize these duplicates by taking the maximum of each column for each ID
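
The summarising step described in the comment above can be illustrated on toy data. The reviewed script uses data.table. The sketch below uses base R `aggregate` instead so that it runs without extra packages, and the values in `rd` are invented for the example.

```r
# Toy stand-in for rd: each ID appears twice, once with the count and once with 0.
rd <- data.frame(
  ID = c("ID1", "ID1", "ID2", "ID2"),
  N = c(10L, 10L, 8L, 8L),
  N_MAE = c(3L, 0L, 0L, 2L)
)

# Collapse to one row per ID by taking the maximum of every other column.
rd_summary <- aggregate(. ~ ID, data = rd, FUN = max)
```

With data.table, an equivalent expression would be `rd[, lapply(.SD, max), by = ID]`. If the duplicate rows hold `NA` rather than 0, `max` would need `na.rm = TRUE`.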


c-mertes

Please adapt the code as discussed.

c-mertes · 2022-04-21T08:57:01Z

docs/source/output.rst

+DROP is intended to help researchers use RNA-Seq data in order to detect genes with aberrant expression,
+aberrant splicing and mono-allelic expression. By simplifying the workflow process we hope to provide
+easy to read and interpret html files and output files. This section is dedicated to explaining the relevant
+results files. We will use the results of the ``demo`` to explain the files generated.::


Suggested change

results files. We will use the results of the ``demo`` to explain the files generated.::

results files. We will use the results of the ``demo`` to explain the files generated by the following commands:

c-mertes · 2022-04-21T08:58:53Z

docs/source/output.rst

+Aberrant Expression
+++++++++++++++++++
+
+html file


Suggested change

html file

HTML file

c-mertes · 2022-04-21T08:59:17Z

docs/source/output.rst

+
+DROP is intended to help researchers use RNA-Seq data in order to detect genes with aberrant expression,
+aberrant splicing and mono-allelic expression. By simplifying the workflow process we hope to provide
+easy to read and interpret html files and output files. This section is dedicated to explaining the relevant


Suggested change

easy to read and interpret html files and output files. This section is dedicated to explaining the relevant

easy to read and interpret HTML files and output files. This section is dedicated to explaining the relevant

c-mertes · 2022-04-21T09:00:52Z

docs/source/output.rst

+
+* Counting Summaries 
+    * For each aberrant expression group
+        * split of local vs external sample counts


Suggested change

* split of local vs external sample counts

* number of local vs external sample

c-mertes · 2022-04-21T09:01:16Z

docs/source/output.rst

+        * information about the expressed genes within each sample and as a dataset
+* Outrider Summaries
+    * For each aberrant expression group
+        * the number of aberrantly expressed gene per sample


Suggested change

* the number of aberrantly expressed gene per sample

* the number of aberrantly expressed genes per sample

c-mertes · 2022-04-21T09:09:06Z

docs/source/output.rst

+* Files
+    * OUTRIDER files for each aberrant expression group
+        * For each of these files you can follow the `OUTRIDER vignette for individual analysis <https://www.bioconductor.org/packages/devel/bioc/vignettes/OUTRIDER/inst/doc/OUTRIDER.pdf>`_. 
+    * tsv files
+        * For each aberrant expression group
+            * results.tsv
+                * this tsv file contains only the significant genes and samples that meet the cutoffs defined in the ``config.yaml`` for ``padjCutoff`` and ``zScoreCutoff``


Suggested change

* Files
    * OUTRIDER files for each aberrant expression group
        * For each of these files you can follow the `OUTRIDER vignette for individual analysis <https://www.bioconductor.org/packages/devel/bioc/vignettes/OUTRIDER/inst/doc/OUTRIDER.pdf>`_.
    * tsv files
        * For each aberrant expression group
            * results.tsv
                * this tsv file contains only the significant genes and samples that meet the cutoffs defined in the ``config.yaml`` for ``padjCutoff`` and ``zScoreCutoff``

* Files (for each aberrant expression group)
    * OUTRIDER data files (RDS)
        * You can follow the `OUTRIDER vignette for further individual analysis <https://www.bioconductor.org/packages/devel/bioc/vignettes/OUTRIDER/inst/doc/OUTRIDER.pdf>`.
    * results files (TSV)
        * the result file contains only the significant genes and samples that meet the cutoffs defined in the ``config.yaml`` for ``padjCutoff`` and ``zScoreCutoff``

c-mertes · 2022-04-21T09:10:31Z

docs/source/output.rst

+    * For each aberrant splicing group
+        * split of local (from internal BAM files) vs external sample counts
+        * split of local vs merged with external sample splicing/intron counts
+        * comparison of local and external log mean counts


Suggested change

* comparison of local and external log mean counts

* comparison of local and external mean counts

c-mertes · 2022-04-21T09:34:02Z

drop/demo/config_relative.yaml

@@ -16,13 +16,14 @@ exportCounts:
    - v29


Do we need to maintain this file twice, once in the code base and once in the resource tar file?

It's not in the resource tar.

c-mertes · 2022-04-21T09:34:23Z

drop/demo/sample_annotation_relative.tsv

@@ -1,23 +1,25 @@
-RNA_ID	RNA_BAM_FILE	DNA_VCF_FILE	DNA_ID	DROP_GROUP	PAIRED_END	COUNT_MODE	COUNT_OVERLAPS	STRAND	HPO_TERMS	GENE_COUNTS_FILE	GENE_ANNOTATION	GENOME
+RNA_ID	RNA_BAM_FILE	DNA_VCF_FILE	DNA_ID	DROP_GROUP	PAIRED_END	COUNT_MODE	COUNT_OVERLAPS	STRAND	HPO_TERMS	GENE_COUNTS_FILE	GENE_ANNOTATION	GENOME	SPLICE_COUNTS_DIR


Same as above, do we need to maintain it twice?

c-mertes · 2022-04-21T09:59:33Z

drop/modules/aberrant-splicing-pipeline/Counting/01_1_countRNA_splitReads_samplewise.R

@@ -6,13 +6,13 @@
 #'    - snakemake: '`sm str(tmp_dir / "AS" / "{dataset}" / "splitReads" / "{sample_id}.Rds")`'
 #'  params:
 #'   - setup: '`sm cfg.AS.getWorkdir() + "/config.R"`'
-#'   - workingDir: '`sm cfg.getProcessedDataDir() + "/aberrant_splicing/datasets"`'
+#'   - workingDir: '`sm cfg.getProcessedDataDir() + "/aberrant_splicing/datasets/fromBam"`'


use ..../datasets/raw-local-{dataset} raw-{dataset} {dataset}

c-mertes and others added 25 commits August 11, 2021 14:48

initial merge of external splicing counts for FRASER

4b6b0bf

fix download and add more test cases

e0e5844

fix test

7e59764

fix wget download and heatmap plotting

434135d

adapt to new naming of sampleannotation

d353a56

use only exact matching in subsetBy related to #244

2712658

fix merge of subsetGroups function related to: #246

f85e130

fix snakemake file dependency after merging external counts.

91566f6

correct naming

0cf5832

cleanup code

b1be9d3

update FRASER dependency for merge count functionality

6e0467c

Merge branch 'dev' into new_external_merge_splicing

48a0ab2

Merge branch 'dev'

b86f008

merge with dev

971a401

change input/output paths.

fdfd3cd

add symlinks

39744c9

add explicit biallelic filter

d7e0894

update regex matching

2f81989

snakemake 7 workarounds

d1f60cd

Merge branch 'small_fix' into new_external_merge_splicing

ce0a75f

Update to MAE filter scripts

a5f8de0

update backend for externalCounts

4dbcf0e

remove importExport for test

5c40c88

comments and cleanup

fa12be8

rename demo groups

ab7598f

vyepez88 previously approved these changes Apr 1, 2022

Smith Nicholas and others added 3 commits April 1, 2022 15:55

more information with external counts

4b385c3

Update README.md

9d68b2f

update with fdsMerge

a079606

nickhsmith dismissed vyepez88’s stale review via a079606 April 1, 2022 15:51

nickhsmith added 2 commits April 7, 2022 16:07

MAE results test

aa9dd7b

update test to match demo config

0141dc8

nickhsmith force-pushed the new_external_merge_splicing branch from 7debedc to 0141dc8 Compare April 7, 2022 20:24

nickhsmith and others added 14 commits April 8, 2022 13:24

allow for legacy sample annotation

c100b74

improve legacy handling

9e9d909

update FRASER version requirement

0eb78fc

fix column typo

ef65020

update plots to match config

0973a53

update

c363c11

Update README.md

59c4d2a

Clarifications added to possible QC values

4c83f2d

Update DNA_RNA_matrix_plot.R

420d31c

code review formatting fixes

2322f5f

Merge branch 'small_fix' into new_external_merge_splicing

eae85db

update docs

46df827

html outputs

4264ff0

MAE plot xlim

9daef0e

nickhsmith requested changes Apr 21, 2022

Merge branch 'dev' into new_external_merge_splicing

a04da62

c-mertes commented Apr 21, 2022

nickhsmith added 4 commits April 22, 2022 16:14

code-review fixes

5917caa

Update output.rst

2d220f8

Update output.rst

ea1b26d

Update output.rst

d15a702

nickhsmith previously approved these changes Apr 22, 2022

Update output.rst

1a628cb

vyepez88 dismissed nickhsmith’s stale review via 1a628cb April 22, 2022 14:57

nickhsmith approved these changes Apr 22, 2022

nickhsmith merged commit 8a88adb into dev Apr 22, 2022

nickhsmith deleted the new_external_merge_splicing branch April 22, 2022 16:35

This pull request was closed.
