Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initialise plotMediation method #165

Open
wants to merge 1 commit into
base: devel
Choose a base branch
from
Open

Initialise plotMediation method #165

wants to merge 1 commit into from

Conversation

RiboRings
Copy link
Member

Hi!

This PR introduces the plotMediation method meant to visualise results from mia functions getMediation and addMediation.

It is based on ComplexHeatmap but if you want to keep miaViz depend only on ggplot I can change to ggplot.

It would be nice to have a barplot option but it complicates the method because data like confidence intervals is not stored directly in the main output.

Feel free to give your thoughts!

Copy link
Member

@antagomir antagomir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems good to me, I just added some comments.

Comment on lines +90 to +95
if(p_mat[i, j] < 0.001) {
grid.text("***", k, y)
} else if(p_mat[i, j] < 0.01) {
grid.text("**", k, y)
} else if(p_mat[i, j] < 0.05) {
grid.text("*", k, y)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could / should these thresholds be something that the user could define? and these ones here could be the default values? It could be for instance one argument, like signif.threshold = c(0.001, 0.01, 0.05) or if it is just signif.threshold = c(0.01, 0.05) with two values, then only two levels of stars etc?

Comment on lines +103 to +107
p <- Heatmap(coef_mat, name = "Effect",
cluster_rows = FALSE, cluster_columns = FALSE,
row_names_side = "left", column_names_side = "top",
column_names_rot = 0, rect_gp = gpar(col = "white", lwd = 2),
cell_fun = cell_fun)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My impression is that it would be good if users can pass these arguments, so these would come via the "..." mechanism in the function call? And the manpage could mention that the ComplexHeatmap::Heatmap arguments are supported. Or at least some critical ones.

@antagomir
Copy link
Member

imo ComplexHeatmap could be added.

Copy link
Contributor

@TuomasBorman TuomasBorman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should not add ComplexHeatmap as dependency.

  1. This can be achieved with ggplot2
  2. Even though ComplexHeatmap is easier to plot, it is harder to modify afterwards
  3. It causes lack of standardization (I think all the plots in miaViz should be done with ggplot as it is the gold standard, if possible)
  4. It makes it confusing if the package outputs plots in different formats

Comment on lines +69 to +72
#' @rdname plotMediation
#' @export
setGeneric("plotMediation", function(x, ...) standardGeneric("plotMediation"))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add generics to AllGenerics.R

@TuomasBorman
Copy link
Contributor

This is the ComplexHeatmap plot. It looks good, but the color scale is little bit misleading. I think it should be scaled to 0 (0 = white)

image

@TuomasBorman
Copy link
Contributor

This is a script that does the same in ggplot2. Please modify and improve it to fit the function

devtools::load_all()

library(mia)
library(scater)

# Load dataset
data(hitchip1006, package = "miaTime")
tse <- hitchip1006

# Agglomerate features by family (merely to speed up execution)
tse <- agglomerateByRank(tse, rank = "Phylum")
# Convert BMI variable to numeric
tse$bmi_group <- as.numeric(tse$bmi_group)

# Apply clr transformation to counts assay
tse <- transformAssay(tse,
                      method = "clr",
                      pseudocount = 1)

# Analyse mediated effect of nationality on BMI via clr-transformed features
# 100 permutations were done to speed up execution, but ~1000 are recommended
tse <- addMediation(tse, name = "assay_mediation",
                    outcome = "bmi_group",
                    treatment = "nationality",
                    assay.type = "clr",
                    covariates = c("sex", "age"),
                    treat.value = "Scandinavia",
                    control.value = "CentralEurope",
                    boot = TRUE, sims = 100,
                    p.adj.method = "fdr")
###############################################################################
# Prepare data
df <- metadata(tse)[["assay_mediation"]]
df

library(tidyr)
df <- df %>%
    pivot_longer(
        cols = c(ACME_estimate, ADE_estimate, ACME_pval, ADE_pval),
        names_to = c("Condition", "Metric"),
        names_sep = "_"
    ) %>%
    pivot_wider(
        names_from = Metric,
        values_from = value
    )

df[["Mediator"]] <- factor(df[["Mediator"]], levels = rev(unique(df[["Mediator"]])))

df[["pval_symbol"]] <- ifelse(df[["pval"]] < 0.001, "***", ifelse(df[["pval"]] < 0.01, "**", ifelse(df[["pval"]] <= 0.05, "*", "")))

##############################################
# Heatmap
library(ggplot2)

# Create heatmap
ggplot(df, aes(x = Condition, y = Mediator, fill = estimate)) +
    geom_tile(color = "white", lwd = 1.5, linetype = 1) +
    # scale_fill_gradient2(low = "blue", mid = "white", high = "red", limits = c(-max(abs(df[["value"]])), max(abs(df[["value"]]))), name = "Effect")
    scale_fill_gradientn(colours = c("darkblue", "blue", "white", "red", "darkred"), # Colors
                         values = scales::rescale(c(-0.4, -0.3, -0.2, -0.1, 0.1)),  # Map white to -0.2
                         limits = c(-0.4, 0.1), # Overall range of the scale
                         breaks = c(-0.4, -0.3, -0.2, -0.1, 0), # Tick marks on the color bar
                         name = "Effect"
    ) +  theme_minimal() +
    scale_x_discrete(position = "top") + labs(x = "", y = "") +
    geom_text(aes(label = pval_symbol), color = "black", size = 5) +
    theme(axis.text.y = element_text(size = 12), # Customize y-axis text size
          axis.text.x = element_text(size = 12))

image

@TuomasBorman
Copy link
Contributor

We were also talking about forests plots. That can be achieved with this (same again, please make it so that it can be generalized for the function):

# Forest plot

med_df <- getMediation(
    tse,
    outcome = "bmi_group",
    treatment = "nationality",
    mediator = "diversity",
    covariates = c("sex", "age"),
    treat.value = "Scandinavia",
    control.value = "CentralEurope",
    boot = TRUE, sims = 100,
    add.metadata = TRUE)
plot(attr(med_df, "metadata")[[1]])
dat <- attr(med_df, "metadata")[[1]]

estimates <- c("Total\nEffect" = "tau", "ADE" = "z0", "ACME" = "d0")
df <- lapply(estimates, function(x){
    temp <- dat[ paste0(x, c("", ".coef", ".ci")) ]
    temp <- unlist(temp, use.names = FALSE)
    return(temp)
})
df <- do.call(rbind, df) |> as.data.frame()
colnames(df) <- c("est", "lower", "upper")
estmates <- 
df[["estimates"]] <- factor(names(estimates), levels = names(estimates))

ggplot(df, aes(x = est, y = estimates)) +
    geom_point(size = 3) +  # Add points for estimates
    geom_errorbarh(aes(xmin = lower, xmax = upper), height = 0) +  # Horizontal error bars
    labs(x = "Estimate", y = NULL) +  # Axis labels and title
    theme_minimal() +  # Minimal theme for a clean look
    theme(
        axis.text.y = element_text(size = 12),  # Adjust Y-axis text size
        axis.title.x = element_text(size = 14),  # Adjust X-axis title size
    )

The first one is the one that you showed, and the latter one is made with ggplot
image

image

@TuomasBorman
Copy link
Contributor

Also, I found that this "metadata" is rather mess. Does it come directly from mediate function? If it does not, then we should make it neater. For instance, we could combine it to table. Currently, it is hard to find relevant information as it is just a long list

dat <- attr(med_df, "metadata")[[1]]

@antagomir
Copy link
Member

If ComplexHeatmap is not added as a dependency, should we still consider showing an example with that as well, in OMA for instance?

@TuomasBorman
Copy link
Contributor

If ComplexHeatmap is not added as a dependency, should we still consider showing an example with that as well, in OMA for instance?

Yes, and we have currently and example on that, for instance: https://microbiome.github.io/OMA/docs/devel/pages/correlation.html#association-between-taxa-and-sample-metadata

@RiboRings
Copy link
Member Author

RiboRings commented Jan 20, 2025

Also, I found that this "metadata" is rather mess. Does it come directly from mediate function?

Thanks for the comments! I agree it is better to use standard ggplot2.

The metadata is a list of model outputs for each mediator used, as someone might want to have access to the original mediation model output.

I can check if that can be organised in a table.

@RiboRings
Copy link
Member Author

In microbiome/mia#678 the messy metadata is polished. If that looks good, I will then proceed with plotMediation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants