Microbiome analysis in R - Mediation Analysis

Author

Max Qiu, Chia Sin Liew, Jean-Jack Riethoven,
Binformatics Core Research Facility (BCRF)
Nebraska Center for Biotechnology,
University of Nebraska-Lincoln

Published

October 17, 2023

Note

For a free consultation, please visit our consultation scheduler!

Workshop and Data Overview

This installment of the microbiome analysis is mediation analysis. For this session, we will be using the full phyloseq object from Salomon, JD et al., 2023 so we have more covariants to work with. The data has two main covariates, group (CNT and CPB representing control samples and cardiopulmonary bypass samples) and time (pre and post representing pre-operation and post-operation time points).

Getting Ready

Load R packages we will be using for today’s workshop.

suppressMessages({
        library("tidyverse")
        library("phyloseq")
        library("ggsci")
        library("vegan")
        library("energy")
        library("ade4")
        library("grid")
        library("gridExtra")
})
no_of_cores = 1
source("http://raw.githubusercontent.com/bashirhamidi/MODIMA/master/modima.R", max.deparse.length=100000)
source("https://raw.githubusercontent.com/alekseyenko/WdStar/master/Wd.R", max.deparse.length=100000)

Copy all needed files to $WORK, create the 16S_Workshop directory if it does not exist, and set it as working directory using commands below.

system("cp -Ru /work/riethoven/jeanjack/16S_Workshop $WORK/", intern=TRUE)
setwd(file.path(Sys.getenv("WORK"), "16S_Workshop"))

Load the phyloseq object constructed from whole dataset of Salomon, JD et al., 2023.

load("./intermediate/SalomonJD_et_al_covariates.rda")

Inspect objects loaded into your environment.

ls()
 [1] "create_dt"                         "dist.cohen.d"                     
 [3] "dist.group.sigma2"                 "dist.sigma2"                      
 [5] "dist.ss2"                          "generic.distance.permutation.test"
 [7] "is.dist"                           "modima"                           
 [9] "MODIMAstat"                        "no_of_cores"                      
[11] "permuteDist"                       "ps"                               
[13] "spdcov.test"                       "subset.distance"                  
[15] "trueunicode"                       "Tw2"                              
[17] "Tw2.posthoc.1vsAll.tests"          "Tw2.posthoc.tests"                
[19] "Tw2.test"                          "WdS"                              
[21] "WdS.test"                         
ps
phyloseq-class experiment-level object
otu_table()   OTU Table:         [ 4578 taxa and 24 samples ]
sample_data() Sample Data:       [ 24 samples by 17 sample variables ]
tax_table()   Taxonomy Table:    [ 4578 taxa by 7 taxonomic ranks ]
phy_tree()    Phylogenetic Tree: [ 4578 tips and 4576 internal nodes ]
refseq()      DNAStringSet:      [ 4578 reference sequences ]

To rearrange the levels of factor or to identify which level of a factor is reference

Note

Levels of a factor are ordered alphabetically by default, which means for group factor with two levels CNT and CPB, CNT is default reference level; for time factor with two levels pre and post, post is default reference level.

sample_meta = sample_data(ps) %>%
        dplyr::as_tibble() %>%
        dplyr::mutate(time = factor(time)) %>% # this step make `time` a factor so we can rearrange its levels
        within(., time <- relevel(time, ref = "pre")) %>% # this step set `pre` as ref level
        as.data.frame()

rownames(sample_meta) = sample_meta$sample 
sample_data(ps) = data.frame(sample_meta)

Inspect and Filter phyloseq object

A few screening and filtering steps before final phyloseq object is ready for abundance visualization and diversity analysis.

  • Step 1, remove Archaea, unknowns, chloroplasts;
ps.clean <- subset_taxa(ps, Kingdom == "Bacteria") %>%
        subset_taxa(!is.na(Phylum)) %>%
        subset_taxa(!Class %in% c("Chloroplast")) %>%
        subset_taxa(!Family %in% c("mitochondria"))
ps.clean
phyloseq-class experiment-level object
otu_table()   OTU Table:         [ 4541 taxa and 24 samples ]
sample_data() Sample Data:       [ 24 samples by 17 sample variables ]
tax_table()   Taxonomy Table:    [ 4541 taxa by 7 taxonomic ranks ]
phy_tree()    Phylogenetic Tree: [ 4541 tips and 4539 internal nodes ]
refseq()      DNAStringSet:      [ 4541 reference sequences ]
Note

We are only interested in bacteria. Removed 37 taxa.

  • Step 2, filter taxa (ASV) with low abundance (< 2);
ps.clean.p0 <- filter_taxa(ps.clean, function (x) {sum(x > 0) >= 2}, prune=TRUE)
ps.clean.p0
phyloseq-class experiment-level object
otu_table()   OTU Table:         [ 3564 taxa and 24 samples ]
sample_data() Sample Data:       [ 24 samples by 17 sample variables ]
tax_table()   Taxonomy Table:    [ 3564 taxa by 7 taxonomic ranks ]
phy_tree()    Phylogenetic Tree: [ 3564 tips and 3562 internal nodes ]
refseq()      DNAStringSet:      [ 3564 reference sequences ]
Note

Abundance is the sum of the number of times an ASV is observed across all samples. Removed 977 taxa.

Complete metadata are shown below:

sample_data(ps.clean.p0)

(Causal) Mediation Analysis

In statistics, mediation analysis tests a hypothetical causal chain where one variable X (Exposure) affects a second variable M (Mediator) and, in turn, that variable affects a third variable Y (Outcome). Mediators describe the how or why of a (typically well-established) relationship between two other variables (Exposure and Outcome), as they often describe the process through which an effect occurs (an indirect effect). For instance, people with higher incomes tend to live longer but this effect is explained by the mediating influence of having access to better health care.

Mediation analysis is not new. This is an established branch of statistics and there are tutorials and r packages/tutorials available for guidance. Good examples are here and here.

Figure 1 A scenario with a single mediator between exposure and outcome.

Many important exposure–outcome relationships, such as diet and weight, can be influenced by mediators, such as the gut microbiome. Understanding the role of the mediators, i.e., gut microbiome, is important in refining cause–effect theories and discovering additional medical interventions (e.g., probiotics, prebiotics). (This is just one scenario where one treat microbiome as mediator.) In our example, mediation analysis can be performed to further understand the role the microbiome played as a mediator for outcomes such as measured biomarkers (SCFA or EBD), using group as the exposure.

However, the application of mediation analysis in the analysis of microbiome data, where the effect of a treatment on an outcome is transmitted through perturbing the microbial communities or compositional mediators, is not straightforward. The compositional (i.e., each relative abundance is a non-negative value [0,1), which adds up to 1) and high-dimensional nature of microbiome data makes the standard mediation analysis not directly applicable to our setting.

Figure 2 High-dimensional mediation model for the microbiome data

Recent years, there has been many implementation of multivariate mediation analysis for microbiome data analysis. A few examples:

  • MODIMA: distance-based approach; accommodate binary outcomes; allow multivariate exposures and outcomes; does not allow adjustment of confounding covariates; test mediation effect at community level; permutation-based testing provide empirical p-value; good documentation, better in visualization and graphics.

  • ccmm (cmmb, the extension for binary outcomes): linear model-based approach; use pseudo-counts when performing log-ratio transformation to handle high-dimensionality; does not allow multivariate exposures or outcomes; test mediation effect at both community and taxon levels; allow adjustment of confounding covariates; no error rate control.

  • LDM: linear model-based approach; allow multivariate exposures and outcomes; allow adjustment of confounding covariates; test mediation effect at community and taxon level; control FDR; better in documentation, updates on a regular basis for the best developer support.

Mediation analysis using MODIMA

MODIMA (Multivariate Omnibus Distance Mediation Analysis) used distance correlation and partial distance correlation from energy statistics (E-statistics), expresses the relationships among exposure, mediator and outcome in terms of Pearson correlations. Thus, for a significant effect of the exposure \(X\) on the outcome \(Y\) to exist, the correlation between the two has to be non-zero. Furthermore, if the relationship is in fact mediated by \(M\), both the correlation between exposure and the mediator, and the conditional correlation of the mediator and the outcome, given the exposure, should both be non-zero.

The goal of MODIMA is to explain the role of microbiome as a mediator just like a univariate mediator would, therefore it only provide p-value for a global test of community-level mediation.

Below we demonstrate mediation analysis using MODIMA, testing hypotheses where group as exposure, and one of EBD markers or SCFA as outcome.

# automation
outcome = colnames(sample_meta)[c(7:17)]
exposure = colnames(sample_meta)[6]

combo = expand.grid(exposure, outcome) %>%
        mutate(Var3 = paste(Var1, Var2, sep = "-") )

combo
    Var1                      Var2                            Var3
1  group                  claudin2                  group-claudin2
2  group                  claudin3                  group-claudin3
3  group                     fabp2                     group-fabp2
4  group            propionic_acid            group-propionic_acid
5  group           isobutyric_acid           group-isobutyric_acid
6  group              butyric_acid              group-butyric_acid
7  group    x2_methyl_butyric_acid    group-x2_methyl_butyric_acid
8  group           isovaleric_acid           group-isovaleric_acid
9  group capric_and_isocapric_acid group-capric_and_isocapric_acid
10 group               acetic_acid               group-acetic_acid
11 group              valeric_acid              group-valeric_acid
combo_pair = combo$Var3[[1]]

covariate = unlist(str_split(combo_pair, pattern = "-"))
var1 = covariate[1] # exposure
var2 = covariate[2] # outcome

#subset phyloseq object
ps.mediation = subset_samples(ps.clean.p0, !is.na( sample_data(ps.clean.p0)[[var2]] ))

# compute distance: mediator
mediator = phyloseq::distance(ps.mediation, method = "jsd") #"jsd", "unifrac", "wunifrac", "bray", "jaccard", "dpcoa"

# compute distance: exposure
exposure = dist(cbind(factor(sample_data(ps.mediation)[[var1]])))

# compute distance: response
response = dist(sample_data(ps.mediation)[[var2]])

set.seed(42)
res = modima(exposure, mediator, response, nrep=999)
res

    MODIMA: a Method for Multivariate Omnibus Distance Mediation Analysis

data:  exposure =  exposure 
       mediator =  mediator 
       response =  response 
number of permutations= 1000 
sample estimates are 
    -bias-corrected distance correlation (energy::bcdcor) of indicated pairs and 
    -partial distance correlation (energy::pdcor) of indicated triple
MODIMA stat = -0.0050081, p-value = 0.635
sample estimates:
        exposure–mediator bcdcor         exposure–response bcdcor 
                      0.20524104                       0.24956996 
        mediator–response bcdcor mediator–response–exposure pdcor 
                      0.02809610                      -0.02440105 
combo_pair = combo$Var3[[2]]

covariate = unlist(str_split(combo_pair, pattern = "-"))
var1 = covariate[1] # exposure
var2 = covariate[2] # outcome

#subset phyloseq object
ps.mediation = subset_samples(ps.clean.p0, !is.na( sample_data(ps.clean.p0)[[var2]] ))

# compute distance: mediator
mediator = phyloseq::distance(ps.mediation, method = "jsd") #"jsd", "unifrac", "wunifrac", "bray", "jaccard", "dpcoa"

# compute distance: exposure
exposure = dist(cbind(factor(sample_data(ps.mediation)[[var1]])))

# compute distance: response
response = dist(sample_data(ps.mediation)[[var2]])

set.seed(42)
res = modima(exposure, mediator, response, nrep=999)
res

    MODIMA: a Method for Multivariate Omnibus Distance Mediation Analysis

data:  exposure =  exposure 
       mediator =  mediator 
       response =  response 
number of permutations= 1000 
sample estimates are 
    -bias-corrected distance correlation (energy::bcdcor) of indicated pairs and 
    -partial distance correlation (energy::pdcor) of indicated triple
MODIMA stat = -0.02715, p-value = 0.993
sample estimates:
        exposure–mediator bcdcor         exposure–response bcdcor 
                      0.20524104                       0.02859924 
        mediator–response bcdcor mediator–response–exposure pdcor 
                     -0.12354644                      -0.13228530 
combo_pair = combo$Var3[[3]]

covariate = unlist(str_split(combo_pair, pattern = "-"))
var1 = covariate[1] # exposure
var2 = covariate[2] # outcome

#subset phyloseq object
ps.mediation = subset_samples(ps.clean.p0, !is.na( sample_data(ps.clean.p0)[[var2]] ))

# compute distance: mediator
mediator = phyloseq::distance(ps.mediation, method = "jsd") #"jsd", "unifrac", "wunifrac", "bray", "jaccard", "dpcoa"

# compute distance: exposure
exposure = dist(cbind(factor(sample_data(ps.mediation)[[var1]])))

# compute distance: response
response = dist(sample_data(ps.mediation)[[var2]])

set.seed(42)
res = modima(exposure, mediator, response, nrep=999)
res

    MODIMA: a Method for Multivariate Omnibus Distance Mediation Analysis

data:  exposure =  exposure 
       mediator =  mediator 
       response =  response 
number of permutations= 1000 
sample estimates are 
    -bias-corrected distance correlation (energy::bcdcor) of indicated pairs and 
    -partial distance correlation (energy::pdcor) of indicated triple
MODIMA stat = -0.011443, p-value = 0.794
sample estimates:
        exposure–mediator bcdcor         exposure–response bcdcor 
                      0.18673848                       0.11706244 
        mediator–response bcdcor mediator–response–exposure pdcor 
                     -0.03792873                      -0.06128066 
combo_pair = combo$Var3[[4]]

covariate = unlist(str_split(combo_pair, pattern = "-"))
var1 = covariate[1] # exposure
var2 = covariate[2] # outcome

#subset phyloseq object
ps.mediation = subset_samples(ps.clean.p0, !is.na( sample_data(ps.clean.p0)[[var2]] ))

# compute distance: mediator
mediator = phyloseq::distance(ps.mediation, method = "jsd") #"jsd", "unifrac", "wunifrac", "bray", "jaccard", "dpcoa"

# compute distance: exposure
exposure = dist(cbind(factor(sample_data(ps.mediation)[[var1]])))

# compute distance: response
response = dist(sample_data(ps.mediation)[[var2]])

set.seed(42)
res = modima(exposure, mediator, response, nrep=999)
res

    MODIMA: a Method for Multivariate Omnibus Distance Mediation Analysis

data:  exposure =  exposure 
       mediator =  mediator 
       response =  response 
number of permutations= 1000 
sample estimates are 
    -bias-corrected distance correlation (energy::bcdcor) of indicated pairs and 
    -partial distance correlation (energy::pdcor) of indicated triple
MODIMA stat = -0.0065083, p-value = 0.719
sample estimates:
        exposure–mediator bcdcor         exposure–response bcdcor 
                      0.17804005                       0.04277599 
        mediator–response bcdcor mediator–response–exposure pdcor 
                     -0.02832224                      -0.03655504 
combo_pair = combo$Var3[[5]]

covariate = unlist(str_split(combo_pair, pattern = "-"))
var1 = covariate[1] # exposure
var2 = covariate[2] # outcome

#subset phyloseq object
ps.mediation = subset_samples(ps.clean.p0, !is.na( sample_data(ps.clean.p0)[[var2]] ))

# compute distance: mediator
mediator = phyloseq::distance(ps.mediation, method = "jsd") #"jsd", "unifrac", "wunifrac", "bray", "jaccard", "dpcoa"

# compute distance: exposure
exposure = dist(cbind(factor(sample_data(ps.mediation)[[var1]])))

# compute distance: response
response = dist(sample_data(ps.mediation)[[var2]])

set.seed(42)
res = modima(exposure, mediator, response, nrep=999)
res

    MODIMA: a Method for Multivariate Omnibus Distance Mediation Analysis

data:  exposure =  exposure 
       mediator =  mediator 
       response =  response 
number of permutations= 1000 
sample estimates are 
    -bias-corrected distance correlation (energy::bcdcor) of indicated pairs and 
    -partial distance correlation (energy::pdcor) of indicated triple
MODIMA stat = -0.014699, p-value = 0.925
sample estimates:
        exposure–mediator bcdcor         exposure–response bcdcor 
                      0.17804005                       0.02123408 
        mediator–response bcdcor mediator–response–exposure pdcor 
                     -0.07744203                      -0.08255990 
combo_pair = combo$Var3[[6]]

covariate = unlist(str_split(combo_pair, pattern = "-"))
var1 = covariate[1] # exposure
var2 = covariate[2] # outcome

#subset phyloseq object
ps.mediation = subset_samples(ps.clean.p0, !is.na( sample_data(ps.clean.p0)[[var2]] ))

# compute distance: mediator
mediator = phyloseq::distance(ps.mediation, method = "jsd") #"jsd", "unifrac", "wunifrac", "bray", "jaccard", "dpcoa"

# compute distance: exposure
exposure = dist(cbind(factor(sample_data(ps.mediation)[[var1]])))

# compute distance: response
response = dist(sample_data(ps.mediation)[[var2]])

set.seed(42)
res = modima(exposure, mediator, response, nrep=999)
res

    MODIMA: a Method for Multivariate Omnibus Distance Mediation Analysis

data:  exposure =  exposure 
       mediator =  mediator 
       response =  response 
number of permutations= 1000 
sample estimates are 
    -bias-corrected distance correlation (energy::bcdcor) of indicated pairs and 
    -partial distance correlation (energy::pdcor) of indicated triple
MODIMA stat = -0.012599, p-value = 0.894
sample estimates:
        exposure–mediator bcdcor         exposure–response bcdcor 
                      0.17804005                       0.03175676 
        mediator–response bcdcor mediator–response–exposure pdcor 
                     -0.06394358                      -0.07076324 
combo_pair = combo$Var3[[7]]

covariate = unlist(str_split(combo_pair, pattern = "-"))
var1 = covariate[1] # exposure
var2 = covariate[2] # outcome

#subset phyloseq object
ps.mediation = subset_samples(ps.clean.p0, !is.na( sample_data(ps.clean.p0)[[var2]] ))

# compute distance: mediator
mediator = phyloseq::distance(ps.mediation, method = "jsd") #"jsd", "unifrac", "wunifrac", "bray", "jaccard", "dpcoa"

# compute distance: exposure
exposure = dist(cbind(factor(sample_data(ps.mediation)[[var1]])))

# compute distance: response
response = dist(sample_data(ps.mediation)[[var2]])

set.seed(42)
res = modima(exposure, mediator, response, nrep=999)
res

    MODIMA: a Method for Multivariate Omnibus Distance Mediation Analysis

data:  exposure =  exposure 
       mediator =  mediator 
       response =  response 
number of permutations= 1000 
sample estimates are 
    -bias-corrected distance correlation (energy::bcdcor) of indicated pairs and 
    -partial distance correlation (energy::pdcor) of indicated triple
MODIMA stat = -0.008712, p-value = 0.777
sample estimates:
        exposure–mediator bcdcor         exposure–response bcdcor 
                      0.17804005                      -0.04747771 
        mediator–response bcdcor mediator–response–exposure pdcor 
                     -0.05654987                      -0.04893303 
combo_pair = combo$Var3[[8]]

covariate = unlist(str_split(combo_pair, pattern = "-"))
var1 = covariate[1] # exposure
var2 = covariate[2] # outcome

#subset phyloseq object
ps.mediation = subset_samples(ps.clean.p0, !is.na( sample_data(ps.clean.p0)[[var2]] ))

# compute distance: mediator
mediator = phyloseq::distance(ps.mediation, method = "jsd") #"jsd", "unifrac", "wunifrac", "bray", "jaccard", "dpcoa"

# compute distance: exposure
exposure = dist(cbind(factor(sample_data(ps.mediation)[[var1]])))

# compute distance: response
response = dist(sample_data(ps.mediation)[[var2]])

set.seed(42)
res = modima(exposure, mediator, response, nrep=999)
res

    MODIMA: a Method for Multivariate Omnibus Distance Mediation Analysis

data:  exposure =  exposure 
       mediator =  mediator 
       response =  response 
number of permutations= 1000 
sample estimates are 
    -bias-corrected distance correlation (energy::bcdcor) of indicated pairs and 
    -partial distance correlation (energy::pdcor) of indicated triple
MODIMA stat = -0.0063363, p-value = 0.701
sample estimates:
        exposure–mediator bcdcor         exposure–response bcdcor 
                      0.17804005                       0.09735320 
        mediator–response bcdcor mediator–response–exposure pdcor 
                     -0.01752163                      -0.03558935 
combo_pair = combo$Var3[[9]]

covariate = unlist(str_split(combo_pair, pattern = "-"))
var1 = covariate[1] # exposure
var2 = covariate[2] # outcome

#subset phyloseq object
ps.mediation = subset_samples(ps.clean.p0, !is.na( sample_data(ps.clean.p0)[[var2]] ))

# compute distance: mediator
mediator = phyloseq::distance(ps.mediation, method = "jsd") #"jsd", "unifrac", "wunifrac", "bray", "jaccard", "dpcoa"

# compute distance: exposure
exposure = dist(cbind(factor(sample_data(ps.mediation)[[var1]])))

# compute distance: response
response = dist(sample_data(ps.mediation)[[var2]])

set.seed(42)
res = modima(exposure, mediator, response, nrep=999)
res

    MODIMA: a Method for Multivariate Omnibus Distance Mediation Analysis

data:  exposure =  exposure 
       mediator =  mediator 
       response =  response 
number of permutations= 1000 
sample estimates are 
    -bias-corrected distance correlation (energy::bcdcor) of indicated pairs and 
    -partial distance correlation (energy::pdcor) of indicated triple
MODIMA stat = -0.01323, p-value = 0.928
sample estimates:
        exposure–mediator bcdcor         exposure–response bcdcor 
                      0.17804005                      -0.04261539 
        mediator–response bcdcor mediator–response–exposure pdcor 
                     -0.08064149                      -0.07430787 
combo_pair = combo$Var3[[10]]

covariate = unlist(str_split(combo_pair, pattern = "-"))
var1 = covariate[1] # exposure
var2 = covariate[2] # outcome

#subset phyloseq object
ps.mediation = subset_samples(ps.clean.p0, !is.na( sample_data(ps.clean.p0)[[var2]] ))

# compute distance: mediator
mediator = phyloseq::distance(ps.mediation, method = "jsd") #"jsd", "unifrac", "wunifrac", "bray", "jaccard", "dpcoa"

# compute distance: exposure
exposure = dist(cbind(factor(sample_data(ps.mediation)[[var1]])))

# compute distance: response
response = dist(sample_data(ps.mediation)[[var2]])

set.seed(42)
res = modima(exposure, mediator, response, nrep=999)
res

    MODIMA: a Method for Multivariate Omnibus Distance Mediation Analysis

data:  exposure =  exposure 
       mediator =  mediator 
       response =  response 
number of permutations= 1000 
sample estimates are 
    -bias-corrected distance correlation (energy::bcdcor) of indicated pairs and 
    -partial distance correlation (energy::pdcor) of indicated triple
MODIMA stat = -0.021254, p-value = 0.993
sample estimates:
        exposure–mediator bcdcor         exposure–response bcdcor 
                      0.17804005                      -0.01733401 
        mediator–response bcdcor mediator–response–exposure pdcor 
                     -0.12053914                      -0.11937792 
combo_pair = combo$Var3[11]

covariate = unlist(str_split(combo_pair, pattern = "-"))
var1 = covariate[1] # exposure
var2 = covariate[2] # outcome

#subset phyloseq object
ps.mediation = subset_samples(ps.clean.p0, !is.na( sample_data(ps.clean.p0)[[var2]] ))

# compute distance: mediator
mediator = phyloseq::distance(ps.mediation, method = "jsd") #"jsd", "unifrac", "wunifrac", "bray", "jaccard", "dpcoa"

# compute distance: exposure
exposure = dist(cbind(factor(sample_data(ps.mediation)[[var1]])))

# compute distance: outcome
outcome = dist(sample_data(ps.mediation)[[var2]])

set.seed(42)
res = modima(exposure, mediator, outcome, nrep=999)
res

    MODIMA: a Method for Multivariate Omnibus Distance Mediation Analysis

data:  exposure =  exposure 
       mediator =  mediator 
       response =  outcome 
number of permutations= 1000 
sample estimates are 
    -bias-corrected distance correlation (energy::bcdcor) of indicated pairs and 
    -partial distance correlation (energy::pdcor) of indicated triple
MODIMA stat = 0.023183, p-value = 0.043
sample estimates:
        exposure–mediator bcdcor         exposure–response bcdcor 
                      0.18069365                       0.07256802 
        mediator–response bcdcor mediator–response–exposure pdcor 
                      0.13897056                       0.12830262 
Note

Mediation analysis by MODIMA provide 4-panel graphs:

  • A shows association or lack of association between exposure and outcome; wilcoxon or t test

  • B shows association or lack of association between exposure and microbiome; PERMANOVA or Multivariate Welch ANOVA

  • C and D show association or lack of association of microbiome PCo1 and PCo2 and outcome, removing any effect of exposure (honestly a bit hard to understand, but please refer to this example in package repo)

References and Further Readings

  1. Chapter 14: Mediation and Moderation
  2. Introduction to Mediation Analysis
  3. MODIMA
  4. ccmm
  5. cmmb
  6. LDM

Reproducibility

R session information:

sessionInfo()
R version 4.2.3 (2023-03-15 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.utf8 
[2] LC_CTYPE=English_United States.utf8   
[3] LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.utf8    

attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] gridExtra_2.3   ade4_1.7-22     energy_1.7-11   vegan_2.6-4    
 [5] lattice_0.20-45 permute_0.9-7   ggsci_3.0.0     phyloseq_1.42.0
 [9] lubridate_1.9.2 forcats_1.0.0   stringr_1.5.0   dplyr_1.1.1    
[13] purrr_1.0.1     readr_2.1.4     tidyr_1.3.0     tibble_3.2.1   
[17] ggplot2_3.4.2   tidyverse_2.0.0

loaded via a namespace (and not attached):
 [1] nlme_3.1-162           bitops_1.0-7           GenomeInfoDb_1.34.9   
 [4] tools_4.2.3            bslib_0.4.2            utf8_1.2.3            
 [7] R6_2.5.1               DT_0.27                DBI_1.1.3             
[10] BiocGenerics_0.44.0    mgcv_1.8-42            colorspace_2.1-0      
[13] rhdf5filters_1.10.1    withr_2.5.0            tidyselect_1.2.0      
[16] compiler_4.2.3         cli_3.6.1              Biobase_2.58.0        
[19] sass_0.4.6             scales_1.2.1           digest_0.6.31         
[22] rmarkdown_2.21         XVector_0.38.0         pkgconfig_2.0.3       
[25] htmltools_0.5.5        fastmap_1.1.1          htmlwidgets_1.6.2     
[28] rlang_1.1.0            rstudioapi_0.14        jquerylib_0.1.4       
[31] generics_0.1.3         jsonlite_1.8.4         crosstalk_1.2.0       
[34] RCurl_1.98-1.12        magrittr_2.0.3         GenomeInfoDbData_1.2.9
[37] biomformat_1.26.0      Matrix_1.5-3           Rcpp_1.0.10           
[40] munsell_0.5.0          S4Vectors_0.36.2       Rhdf5lib_1.20.0       
[43] fansi_1.0.4            ape_5.7-1              lifecycle_1.0.3       
[46] stringi_1.7.12         yaml_2.3.7             MASS_7.3-58.2         
[49] zlibbioc_1.44.0        rhdf5_2.42.1           plyr_1.8.8            
[52] parallel_4.2.3         crayon_1.5.2           Biostrings_2.66.0     
[55] splines_4.2.3          multtest_2.54.0        hms_1.1.3             
[58] knitr_1.42             pillar_1.9.0           igraph_1.4.2          
[61] boot_1.3-28.1          reshape2_1.4.4         codetools_0.2-19      
[64] stats4_4.2.3           glue_1.6.2             evaluate_0.21         
[67] data.table_1.14.8      vctrs_0.6.1            tzdb_0.4.0            
[70] foreach_1.5.2          gtable_0.3.3           cachem_1.0.8          
[73] xfun_0.38              survival_3.5-3         gsl_2.1-8             
[76] iterators_1.0.14       IRanges_2.32.0         cluster_2.1.4         
[79] timechange_0.2.0       ellipsis_0.3.2