Support box plot in addition to violin plot #208

Merged
merged 8 commits into from
Jun 4, 2024

TuomasBorman
Contributor

Related to this issue: #207

This PR adds support for box plots. In addition to a violin plot, the user can now choose to visualize the data with a box plot by specifying layout = "box". I tried to follow your coding style, and these should be the minimum modifications needed to add the support.

Here are some examples of the functionality:

library(scater)

example_sce <- mockSCE()
example_sce <- logNormCounts(example_sce)

# Column-data plots: add per-cell QC metrics first
colData(example_sce) <- cbind(colData(example_sce), perCellQCMetrics(example_sce))

plots <- list()
plots[[1]] <- plotColData(example_sce, y = "Treatment", x = "sum", colour_by = "Mutation_Status", layout = "box")
plots[[2]] <- plotColData(example_sce, y = "Treatment", x = "sum", colour_by = "Mutation_Status")
plots[[3]] <- plotColData(example_sce, y = "detected", x = "Cell_Cycle", colour_by = "Mutation_Status", layout = "test")
plots[[4]] <- plotColData(example_sce, y = "detected", x = "Cell_Cycle", colour_by = "Mutation_Status", layout = "violin")
# Expression plots
plots[[5]] <- plotExpression(example_sce, rownames(example_sce)[1:5])
plots[[6]] <- plotExpression(example_sce, c("Gene_0001", "Gene_0004"), x="Mutation_Status", layout = "violin")
plots[[7]] <- plotExpression(example_sce, rownames(example_sce)[1:5], layout = "box", point_alpha = 0.1, show_se = TRUE)
plots[[8]] <- plotExpression(example_sce, c("Gene_0001", "Gene_0004"), x="Mutation_Status", layout = "box", show_smooth = TRUE)
# Row-data plots: add per-feature QC metrics first
rowData(example_sce) <- cbind(rowData(example_sce), perFeatureQCMetrics(example_sce))

plots[[9]] <- plotRowData(example_sce, y="mean", show_median = TRUE)
plots[[10]] <- plotRowData(example_sce, y="mean", layout = "box", show_violin = TRUE)

library(patchwork)

wrap_plots(plots)

image

-Tuomas

@TuomasBorman
Contributor Author

sce <- example_sce[1, ]
plots <- list()
plots[[1]] <- plotColData(sce, y = "detected", x = "Cell_Cycle", colour_by = "Mutation_Status", layout = "box")
plots[[2]] <- plotColData(sce, y = "detected", x = "Cell_Cycle", colour_by = "Mutation_Status", layout = "box", point_shape = NA)

wrap_plots(plots)

image

@alanocallaghan
Owner

Thanks, looks good!

I would prefer to add a second boolean show_boxplot, because overlaying a boxplot on a violin plot can be useful (it gives you the median and quartiles, at least). Also, it fits with show_violin, whereas otherwise there's some redundancy between show_violin and layout.

This would mean making the boxplots narrower, at least when show_violin is TRUE along with show_boxplot. A width of 0.25 is a magic number I've used in the past and it works nicely, but it may be worth experimenting a bit on example data.
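For illustration, the overlay could look roughly like this in plain ggplot2. This is only a sketch under assumptions: `overlay_plot` and the `group`/`value` columns are hypothetical names, and this is not the code in this PR. The boxplot is narrowed to 0.25 only when a violin is drawn underneath it; otherwise ggplot2's default width is used.

```r
library(ggplot2)

# Hypothetical helper (not part of scater): draw an optional violin and an
# optional boxplot, narrowing the boxplot when it sits on top of a violin.
overlay_plot <- function(df, show_violin = TRUE, show_boxplot = FALSE) {
    p <- ggplot(df, aes(x = group, y = value))
    if (show_violin) {
        p <- p + geom_violin(fill = "grey90", colour = "grey40")
    }
    if (show_boxplot) {
        box <- if (show_violin) {
            # Narrow box so the violin outline stays visible
            geom_boxplot(width = 0.25, outlier.shape = NA)
        } else {
            # No violin: keep ggplot2's default box width
            geom_boxplot(outlier.shape = NA)
        }
        p <- p + box
    }
    p
}

# Made-up example data, for demonstration only
set.seed(1)
df <- data.frame(
    group = rep(c("a", "b"), each = 50),
    value = c(rnorm(50), rnorm(50, mean = 1))
)
p_both <- overlay_plot(df, show_violin = TRUE, show_boxplot = TRUE)
p_box  <- overlay_plot(df, show_violin = FALSE, show_boxplot = TRUE)
```

Printing `p_both` and `p_box` side by side is a quick way to judge whether 0.25 looks right on a given dataset.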

@alanocallaghan
Owner

An alternative would be to have a vector-valued argument, geoms that supports violin and boxplot for now but potentially others in future. However this would probably mean supporting and deprecating the existing version for a release cycle which might make for messier code
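To make the alternative concrete, here is a rough sketch of how a vector-valued argument could be validated while the old flag is still accepted for a release cycle. Everything here is hypothetical: `resolve_geoms` is an invented helper, and the handling of `show_violin` is just one possible way to deprecate it.

```r
# Hypothetical sketch: resolve which geoms to draw from a vector-valued
# `geoms` argument, still honouring a legacy `show_violin` flag.
resolve_geoms <- function(geoms = "violin", show_violin = NULL) {
    supported <- c("violin", "boxplot")
    if (!is.null(show_violin)) {
        # Legacy path: warn, then translate the old flag into the new form
        warning("'show_violin' is deprecated; use 'geoms' instead", call. = FALSE)
        geoms <- if (isTRUE(show_violin)) "violin" else character(0)
    }
    bad <- setdiff(geoms, supported)
    if (length(bad) > 0) {
        stop("unsupported geoms: ", paste(bad, collapse = ", "), call. = FALSE)
    }
    unique(geoms)
}

both <- resolve_geoms(c("violin", "boxplot"))
box_only <- resolve_geoms("boxplot")
legacy_none <- suppressWarnings(resolve_geoms(show_violin = FALSE))
```

The messier part is the legacy branch, which would have to stay until the old argument is removed.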

@TuomasBorman
Contributor Author

show_boxplot seems to be the better solution. The boxplot width is now 0.25 when a violin plot is also drawn, and I think it looks nice. I would keep the default width when the violin is not plotted; it shows more clearly which points belong to which group.

example_sce <- mockSCE()
example_sce <- logNormCounts(example_sce)

# Column-data plots: add per-cell QC metrics first
colData(example_sce) <- cbind(colData(example_sce), perCellQCMetrics(example_sce))

plots <- list()
plots[[1]] <- plotColData(example_sce, y = "Treatment", x = "sum", colour_by = "Mutation_Status", show_boxplot = TRUE)
plots[[2]] <- plotColData(example_sce, y = "Treatment", x = "sum", colour_by = "Mutation_Status")
plots[[3]] <- plotColData(example_sce, y = "detected", x = "Cell_Cycle", colour_by = "Mutation_Status", show_boxplot = TRUE)
plots[[4]] <- plotColData(example_sce, y = "detected", x = "Cell_Cycle", colour_by = "Mutation_Status", show_violin = FALSE)
# Expression plots on a 20 x 20 subset
example_sce <- example_sce[1:20, 1:20]
plots[[5]] <- plotExpression(example_sce, rownames(example_sce)[1:5])
plots[[6]] <- plotExpression(example_sce, c("Gene_0001", "Gene_0004"), x="Mutation_Status", show_violin = FALSE)

plots[[7]] <- plotExpression(example_sce, rownames(example_sce)[1:5], show_boxplot = TRUE, point_alpha = 0.1, show_se = TRUE)
plots[[8]] <- plotExpression(example_sce, c("Gene_0001", "Gene_0004"), x="Mutation_Status", show_boxplot = TRUE, show_smooth = TRUE, show_violin = FALSE)
# Row-data plots: add per-feature QC metrics first
rowData(example_sce) <- cbind(rowData(example_sce), perFeatureQCMetrics(example_sce))

plots[[9]] <- plotRowData(example_sce, y="mean", show_median = TRUE)
plots[[10]] <- plotRowData(example_sce, y="mean", show_boxplot = TRUE, show_violin = TRUE)

library(patchwork)

plots[[11]] <- plotColData(example_sce, y = "detected", x = "Cell_Cycle", colour_by = "Mutation_Status", show_boxplot = TRUE)
plots[[12]] <- plotColData(example_sce, y = "detected", x = "Cell_Cycle", colour_by = "Mutation_Status", show_boxplot = TRUE, point_shape = NA, show_violin = FALSE)

wrap_plots(plots)

image

The figure below shows the boxplots when the width is always 0.25:

image

@TuomasBorman
Contributor Author

> An alternative would be to have a vector-valued argument, geoms that supports violin and boxplot for now but potentially others in future. However this would probably mean supporting and deprecating the existing version for a release cycle which might make for messier code

That could be nice. It would make the code simpler when there are multiple choices. For these two (violin and box), I prefer the current solution, as it is quite simple.

@TuomasBorman
Contributor Author

Is there a way to disable coloring? (I did not find one with a quick search; I'm in a hurry at the moment.) A user might want to create just a basic box plot without any colors.

image

Owner

@alanocallaghan alanocallaghan left a comment

Looks great, thanks! Can address the colour thing in a separate PR if necessary

@alanocallaghan alanocallaghan merged commit 096e367 into alanocallaghan:devel Jun 4, 2024
1 check passed
@alanocallaghan
Owner

btw I hope it goes without saying but you don't need to add your email when adding yourself as a ctb, it's entirely up to you

@TuomasBorman
Contributor Author

> btw I hope it goes without saying but you don't need to add your email when adding yourself as a ctb, it's entirely up to you

Ah yes. I did not pay attention to that; I just copy-pasted my information. I would rather remove my email, but I can do that if I open another PR for the coloring.

@alanocallaghan
Owner

Removed 35f3002

@TuomasBorman
Contributor Author

> Looks great, thanks! Can address the colour thing in a separate PR if necessary

There was already an option for disabling colors, awesome:

plotExpression(example_sce, rownames(example_sce)[1:5], show_boxplot = TRUE, feature_colors = FALSE, show_violin = FALSE)

image
