Commit

Merge pull request #449 from gagneurlab/dev
Update README.md
vyepez88 committed Apr 7, 2023
2 parents 1b98b37 + 53adb7e commit 8a6a490
Showing 14 changed files with 78 additions and 53 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -31,7 +31,7 @@ mamba create -n drop_env -c conda-forge -c bioconda drop --override-channels

In the case of mamba/conda troubles we recommend using the fixed `DROP_<version>.yaml` installation file we make available on our [public server](https://www.cmm.in.tum.de/public/paper/drop_analysis/). Install the current version and use the full path in the following command to install the conda environment `drop_env`
```
mamba env create -f DROP_1.2.3.yaml
mamba env create -f DROP_1.3.2.yaml
```

Test installation with demo project
2 changes: 1 addition & 1 deletion docs/source/conf.py
@@ -23,7 +23,7 @@
author = 'Michaela Müller'

# The full version, including alpha/beta/rc tags
release_ = '1.3.1'
release_ = '1.3.2'



2 changes: 1 addition & 1 deletion docs/source/installation.rst
@@ -20,7 +20,7 @@ Install the latest version and use the full path in the following command to ins

.. code-block:: bash
mamba env create -f DROP_1.3.1.yaml
mamba env create -f DROP_1.3.2.yaml
Installation time: ~ 10min

1 change: 1 addition & 0 deletions docs/source/pipeline.rst
@@ -52,6 +52,7 @@ Subworkflow Description
``aberrantExpression``   Aberrant expression pipeline
``aberrantSplicing``     Aberrant splicing pipeline
``mae``                  Monoallelic expression pipeline
``sampleQC``             DNA-RNA matching (already part of mae, but can be executed independently as well)
``rnaVariantCalling``    RNA Variant Calling pipeline
======================== =======================================================================

2 changes: 1 addition & 1 deletion drop/__init__.py
@@ -4,5 +4,5 @@
from . import utils
from . import demo

__version__ = "1.3.1"
__version__ = "1.3.2"

2 changes: 1 addition & 1 deletion drop/cli.py
@@ -17,7 +17,7 @@

@click.group()
@click_log.simple_verbosity_option(logger)
@click.version_option('1.3.1',prog_name='drop')
@click.version_option('1.3.2',prog_name='drop')


def main():
@@ -34,7 +34,7 @@ suppressPackageStartupMessages({
ods <- readRDS(snakemake@input$ods)

has_external <- any(as.logical(colData(ods)$isExternal))
cnts_mtx_local <- counts(ods, normalized = F)[,!as.logical(ods@colData$isExternal)]
cnts_mtx_local <- counts(ods, normalized = F)[,!as.logical(ods@colData$isExternal),drop=FALSE]
cnts_mtx <- counts(ods, normalized = F)

#' ## Number of samples:
93 changes: 56 additions & 37 deletions drop/modules/aberrant-splicing-pipeline/Counting/Summary.R
@@ -44,54 +44,73 @@ devNull <- saveFraserDataSet(fdsMerge,dir=workingDir, name=paste0("raw-", datase
#' Local: `r sum(!as.logical(fdsMerge@colData$isExternal))`
#' External: `r sum(as.logical(fdsMerge@colData$isExternal))`
#'
#' ## Number of introns:
#' Local (before filtering): `r length(rowRanges(fdsLocal, type = "j"))`
#' ```{asis, echo = has_external}
#' Merged with external counts (before filtering):
#' ```
#' ```{r, eval = has_external, echo=FALSE}
#' length(rowRanges(fdsMerge, type = "j"))
#' ```
#' After filtering: `r sum(mcols(fdsMerge, type="j")[,"passed"])`
#'
#' ## Number of splice sites:
#' Local: `r length(rowRanges(fdsLocal, type = "theta"))`
#' ```{asis, echo = has_external}
#' Merged with external counts:
#' ```
#' ```{r, eval = has_external, echo=FALSE}
#' length(rowRanges(fdsMerge, type = "theta"))
#' ```
#'

#' ```{asis, echo = has_external}
#' ## Comparison of local and external counts
#' **Using external counts**
#' External counts introduce some complexity into the problem of counting junctions
#' because it is unknown whether or not a junction is not counted (because there are no reads)
#' compared to filtered and not present due to legal/personal sharing reasons. As a result,
#' after merging the local (counted from BAM files) counts and the external counts, only the junctions that are
#' present in both remain. As a result it is likely that the number of junctions will decrease after merging.
#'
#' ```
#' ```{r, eval = has_external, echo=has_external}
#' if(has_external){
#' externalCountIDs <- colData(fdsMerge)[as.logical(colData(fdsMerge)[,"isExternal"]),"sampleID"]
#' localCountIDs <- colData(fdsMerge)[!as.logical(colData(fdsMerge)[,"isExternal"]),"sampleID"]
#'
#' ### Number of introns (psi5 or psi3) before and after merging:
#' Local: `r length(rowRanges(fdsLocal, type = "psi5"))`
#' Merged: `r length(rowRanges(fdsMerge, type = "psi5"))`
#' cts <- K(fdsMerge,"psi5")
#' ctsLocal<- cts[,localCountIDs,drop=FALSE]
#' ctsExt<- cts[,externalCountIDs,drop=FALSE]
#'
#' ### Number of splice sites (theta) before and after merging:
#' Local: `r length(rowRanges(fdsLocal, type = "theta"))`
#' Merged: `r length(rowRanges(fdsMerge, type = "theta"))`
#' rowMeanLocal <- rowMeans(ctsLocal)
#' rowMeanExt <- rowMeans(ctsExt)
#'
#' dt <- data.table("Mean counts of local samples" = rowMeanLocal,
#' "Mean counts of external samples" = rowMeanExt)
#'
#' ggplot(dt,aes(x = `Mean counts of local samples`, y= `Mean counts of external samples`)) +
#' geom_hex() + theme_cowplot(font_size = 16) +
#' theme_bw() + scale_x_log10() + scale_y_log10() +
#' geom_abline(slope = 1, intercept =0) +
#' scale_color_brewer(palette="Dark2")
#' }
#' ```
#'

#' ### Comparison of local and external counts
if(has_external){
externalCountIDs <- colData(fdsMerge)[as.logical(colData(fdsMerge)[,"isExternal"]),"sampleID"]
localCountIDs <- colData(fdsMerge)[!as.logical(colData(fdsMerge)[,"isExternal"]),"sampleID"]

cts <- K(fdsMerge,"psi5")
ctsLocal<- cts[,localCountIDs,drop=FALSE]
ctsExt<- cts[,externalCountIDs,drop=FALSE]

rowMeanLocal <- rowMeans(ctsLocal)
rowMeanExt <- rowMeans(ctsExt)

dt <- data.table("Mean counts of local samples" = rowMeanLocal,
"Mean counts of external samples" = rowMeanExt)

ggplot(dt,aes(x = `Mean counts of local samples`, y= `Mean counts of external samples`)) +
geom_hex() + theme_cowplot(font_size = 16) +
theme_bw() + scale_x_log10() + scale_y_log10() +
geom_abline(slope = 1, intercept =0) +
scale_color_brewer(palette="Dark2")
}else{
print("No external counts, comparison is ommitted")
}

#' ## Expression filtering
#' Min expression cutoff: `r snakemake@config$aberrantSplicing$minExpressionInOneSample`
plotFilterExpression(fdsMerge) + theme_cowplot(font_size = 16)
#' The expression filtering step removes introns that are lowly expressed. The requirements for an intron to pass this filter are:
#'
#' * at least 1 sample has `r snakemake@config$aberrantSplicing$minExpressionInOneSample` counts (K) for the intron
#' * at least `r 100*(1-snakemake@config$aberrantSplicing$quantileForFiltering)`% of the samples need to have a total of at least `r snakemake@config$aberrantSplicing$quantileMinExpression` reads for the splice metric denominator (N) of the intron
plotFilterExpression(fdsMerge) +
labs(title="", x="Mean Intron Expression", y="Introns") +
theme_cowplot(font_size = 16)

#' ## Variability filtering
#' Variability cutoff: `r snakemake@config$aberrantSplicing$minDeltaPsi`
plotFilterVariability(fdsMerge) + theme_cowplot(font_size = 16)

#' Introns that passed filter (after merging)
table(mcols(fdsMerge, type="j")[,"passed"])
#' The variability filtering step removes introns that have no or little variability in the splice metric values across samples. The requirement for an intron to pass this filter is:
#'
#' * at least 1 sample has a difference of at least `r snakemake@config$aberrantSplicing$minDeltaPsi` in the splice metric compared to the mean splice metric of the intron
plotFilterVariability(fdsMerge) +
labs(title="", y="Introns") +
theme_cowplot(font_size = 16)
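
The two filtering rules described in the report text above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not FRASER's implementation: the function names are hypothetical, the parameters only mirror the config keys `minExpressionInOneSample`, `quantileForFiltering`, `quantileMinExpression` and `minDeltaPsi`, the splice metric is assumed to be K/N per sample, and samples with N = 0 are assumed to be skipped. The example values are made up.

```python
from statistics import mean


def passes_expression_filter(k, n, min_expression_in_one_sample,
                             quantile_for_filtering, quantile_min_expression):
    """k, n: per-sample counts K and denominators N for a single intron."""
    # Rule 1: at least one sample reaches the minimum count K.
    has_expressed_sample = max(k) >= min_expression_in_one_sample
    # Rule 2: at least (1 - quantile) of the samples reach the minimum N.
    required_fraction = 1 - quantile_for_filtering
    covered_fraction = sum(x >= quantile_min_expression for x in n) / len(n)
    return has_expressed_sample and covered_fraction >= required_fraction


def passes_variability_filter(k, n, min_delta_psi):
    """True if some sample deviates from the mean splice metric by min_delta_psi."""
    # Splice metric per sample; samples with N == 0 are skipped (assumption).
    psi = [ki / ni for ki, ni in zip(k, n) if ni > 0]
    if not psi:
        return False
    mean_psi = mean(psi)
    return max(abs(p - mean_psi) for p in psi) >= min_delta_psi


# Illustrative counts for one intron across four samples.
k = [0, 3, 25, 1]
n = [2, 12, 30, 4]
print(passes_expression_filter(k, n, 20, 0.75, 10))
print(passes_variability_filter(k, n, 0.3))
```

An intron would be kept only if both functions return `True`, which corresponds to the `passed` column that the report reads from `mcols(fdsMerge, type="j")`.
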
6 changes: 3 additions & 3 deletions drop/modules/aberrant-splicing-pipeline/FRASER/Summary.R
@@ -39,9 +39,9 @@ hasExternal <- length(levels(colData(fds)$isExternal) > 1)

#' Number of samples: `r nrow(colData(fds))`
#'
#' Number of introns: `r length(rowRanges(fds, type = "psi5"))`
#' Number of introns: `r length(rowRanges(fds, type = "j"))`
#'
#' Number of splice sites: `r length(rowRanges(fds, type = "theta"))`
#' Number of splice sites: `r length(rowRanges(fds, type = "ss"))`

# used for most plots
dataset_title <- paste0("Dataset: ", dataset, "--", annotation)
@@ -70,7 +70,7 @@ topJ <- 10000
anno_color_scheme <- brewer.pal(n = 3, name = 'Dark2')[1:2]

for(type in psiTypes){
for(normalized in c(T,F)){
for(normalized in c(F,T)){
hm <- plotCountCorHeatmap(
object=fds,
type = type,
2 changes: 1 addition & 1 deletion drop/modules/aberrant-splicing-pipeline/config.R
@@ -27,7 +27,7 @@ extract_params <- function(params) {
unlist(params)[1]
}

options("FRASER.maxSamplesNoHDF5"=1)
options("FRASER.maxSamplesNoHDF5"=0)
options("FRASER.maxJunctionsNoHDF5"=-1)

h5disableFileLocking()
11 changes: 8 additions & 3 deletions drop/modules/mae-pipeline/QC/DNA_RNA_matrix_plot.R
@@ -95,9 +95,14 @@ ann_colors = list(
ann_colors[['status']] <- ann_colors[['status']][unique(c(dna_df$status, rna_df$status))]

#+ Heatmap, fig.height=6, fig.width=8
pheatmap(qc_mat, color = color, cluster_rows = FALSE, cluster_cols = FALSE,
annotation_row = dna_df, annotation_col = rna_df, annotation_colors = ann_colors,
labels_row = 'DNA samples', labels_col = 'RNA samples', angle_col = 0)
if(nrow(qc_mat) > 1 || ncol(qc_mat) > 1){
pheatmap(qc_mat, color = color, cluster_rows = FALSE, cluster_cols = FALSE,
annotation_row = dna_df, annotation_col = rna_df, annotation_colors = ann_colors,
labels_row = 'DNA samples', labels_col = 'RNA samples', angle_col = 0)
} else {
print("No heatmap created as only 1 sample is provided.")
print(qc_mat)
}


#' ## Identify matching samples
2 changes: 1 addition & 1 deletion drop/modules/mae-pipeline/QC/create_matrix_dna_rna_cor.R
@@ -96,7 +96,7 @@ lp <- bplapply(1:N, function(i){
mat <- do.call(rbind, lp)
row.names(mat) <- dna_samples
colnames(mat) <- rna_samples
mat <- mat[sa[rows_in_group, DNA_ID], sa[rows_in_group, RNA_ID]]
mat <- mat[sa[rows_in_group, DNA_ID], sa[rows_in_group, RNA_ID],drop=FALSE]

saveRDS(mat, snakemake@output$mat_qc)

2 changes: 1 addition & 1 deletion setup.cfg
@@ -1,5 +1,5 @@
[bumpversion]
current_version = 1.3.1
current_version = 1.3.2
commit = True

[bumpversion:file:setup.py]
2 changes: 1 addition & 1 deletion setup.py
@@ -21,7 +21,7 @@

setuptools.setup(
name="drop",
version="1.3.1",
version="1.3.2",

author="Vicente A. Yépez, Michaela Müller, Nicholas H. Smith, Daniela Klaproth-Andrade, Luise Schuller, Ines Scheller, Christian Mertes <mertes@in.tum.de>, Julien Gagneur <gagneur@in.tum.de>",
author_email="yepez@in.tum.de",
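
This commit updates the version string in seven places, and the README previously referenced `DROP_1.2.3.yaml` while the other files were at 1.3.1. A small consistency check along the following lines could catch that kind of drift. It is only a sketch: `find_versions` and `stale_files` are hypothetical helpers that are not part of DROP, and the snippets are the new-side lines copied from this diff rather than the contents of the real files.

```python
import re

# Matches X.Y.Z even when preceded by "_" (as in DROP_1.3.2.yaml).
VERSION_PATTERN = re.compile(r"(?<!\d)\d+\.\d+\.\d+(?!\d)")


def find_versions(snippets):
    """Map each file name to the set of X.Y.Z strings found in its snippet."""
    return {name: set(VERSION_PATTERN.findall(text))
            for name, text in snippets.items()}


def stale_files(snippets, expected):
    """Return the names whose snippet does not mention exactly `expected`."""
    return sorted(name for name, found in find_versions(snippets).items()
                  if found != {expected})


# New-side lines from this commit (illustrative input, not read from disk).
snippets = {
    "README.md": "mamba env create -f DROP_1.3.2.yaml",
    "docs/source/conf.py": "release_ = '1.3.2'",
    "docs/source/installation.rst": "mamba env create -f DROP_1.3.2.yaml",
    "drop/__init__.py": '__version__ = "1.3.2"',
    "drop/cli.py": "@click.version_option('1.3.2',prog_name='drop')",
    "setup.cfg": "current_version = 1.3.2",
    "setup.py": 'version="1.3.2",',
}
print(stale_files(snippets, "1.3.2"))
```

Passing the old README line (`DROP_1.2.3.yaml`) instead would report `README.md` as stale.
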