Mediation and Moderation Analysis: Zero to Hero Tutorial

This comprehensive tutorial takes you from the foundational concepts of Mediation and Moderation Analysis all the way through advanced conditional process models, estimation, evaluation, and practical usage within the DataStatPro application. Whether you are encountering these methods for the first time or looking to deepen your understanding of process analysis, this guide builds your knowledge systematically from the ground up.

Prerequisites and Background Concepts
What is Mediation and Moderation Analysis?
The Mathematics Behind Mediation and Moderation
Assumptions of Mediation and Moderation Analysis
Types of Mediation and Moderation Models
Using the Mediation and Moderation Component
Mediation Analysis
Moderation Analysis
Conditional Process Analysis
Model Fit and Evaluation
Advanced Topics
Worked Examples
Common Mistakes and How to Avoid Them
Troubleshooting
Quick Reference Cheat Sheet

1. Prerequisites and Background Concepts

Before diving into mediation and moderation analysis, it is essential to be comfortable with the following foundational statistical concepts. Each is briefly reviewed below.

1.1 Simple and Multiple Linear Regression

Simple linear regression models the relationship between one predictor $X$ and an outcome $Y$ :

$Y = b_0 + b_1 X + e$

Where:

$b_0$ is the intercept — the expected value of $Y$ when $X = 0$ .
$b_1$ is the slope — the expected change in $Y$ for a one-unit increase in $X$ .
$e$ is the residual error — the part of $Y$ not explained by $X$ .

Multiple linear regression extends this to include several predictors:

$Y = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_k X_k + e$

Each coefficient $b_j$ represents the partial effect of $X_j$ on $Y$ , holding all other predictors constant. Mediation and moderation analyses are built entirely on systems of linear regression equations, so a solid understanding of regression is essential.
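As a minimal sketch of the multiple regression equation above, the following fits $Y = b_0 + b_1 X_1 + b_2 X_2 + e$ by ordinary least squares with NumPy. The data are simulated and the variable names are illustrative assumptions; this is not DataStatPro's internal code.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
# Simulated outcome with known coefficients b0 = 1.0, b1 = 2.0, b2 = -0.5.
y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(scale=0.1, size=n)

# Design matrix: the leading column of ones carries the intercept b0.
X = np.column_stack([np.ones(n), x1, x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
b0, b1, b2 = coef

residuals = y - X @ coef  # e: the part of y the predictors do not explain
print(f"b0={b0:.3f}, b1={b1:.3f}, b2={b2:.3f}")
```

Because the data were generated from known coefficients, the estimates should land close to 1.0, 2.0, and -0.5.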

1.2 Standardised vs. Unstandardised Coefficients

Unstandardised coefficients ( $b$ ) are expressed in the original units of $X$ and $Y$ . They answer: "For a one-unit increase in $X$ , how many units does $Y$ change?"

Standardised coefficients ( $\beta$ ) are expressed in standard deviation units and are obtained by standardising all variables (mean = 0, SD = 1) before analysis:

$\beta_j = b_j \cdot \frac{\sigma_{X_j}}{\sigma_Y}$

In mediation analysis, the indirect effect is most meaningfully expressed in unstandardised units (so it represents a real-world quantity). Standardised coefficients facilitate comparison across studies with different measurement scales.
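The conversion formula can be checked numerically. The sketch below, on simulated data with hypothetical names, computes $\beta$ once from the formula and once by regressing z-scores on z-scores; the two routes should agree.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300
x = rng.normal(loc=10, scale=4, size=n)
y = 3.0 + 0.8 * x + rng.normal(scale=2, size=n)

def slope(pred, out):
    """Slope of a simple OLS regression of `out` on `pred`."""
    X = np.column_stack([np.ones(len(out)), pred])
    return np.linalg.lstsq(X, out, rcond=None)[0][1]

b = slope(x, y)                                   # unstandardised coefficient
beta_from_formula = b * x.std(ddof=1) / y.std(ddof=1)

zx = (x - x.mean()) / x.std(ddof=1)               # mean 0, SD 1
zy = (y - y.mean()) / y.std(ddof=1)
beta_from_zscores = slope(zx, zy)
print(round(b, 3), round(beta_from_formula, 3), round(beta_from_zscores, 3))
```

With a single predictor, the standardised coefficient also equals the Pearson correlation between $X$ and $Y$.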

1.3 Causal Diagrams (Path Diagrams)

A path diagram is a graphical representation of a statistical model showing the hypothesised relationships between variables:

Rectangles (or squares) represent observed variables.
Ovals (or circles) represent latent variables (not used in basic mediation/moderation).
Single-headed arrows ( $\rightarrow$ ) represent directional relationships (regression paths).
Double-headed arrows ( $\leftrightarrow$ ) represent correlations (non-directional associations).

The path coefficient on each arrow is the regression coefficient ( $b$ or $\beta$ ) for that specific relationship.

1.4 The Product of Coefficients

The indirect effect in mediation analysis is computed as the product of two regression coefficients: the effect of $X$ on $M$ (the mediator), and the effect of $M$ on $Y$ (the outcome, controlling for $X$ ):

$\text{Indirect Effect} = a \times b$

This product of coefficients is the foundation of modern mediation analysis. Understanding that an indirect path through a mediator equals the product of the coefficients on that path is the single most important concept in mediation analysis.

1.5 Interaction Terms

An interaction between two variables $X$ and $W$ is represented by their product:

$X \times W$

When this product term is included in a regression model, it captures how the effect of $X$ on $Y$ depends on the level of $W$ . This is the mathematical basis of moderation analysis. The interaction term is NOT the same as the sum or average of $X$ and $W$ — it is specifically their product.

1.6 The Concept of Conditional Effects

A conditional effect is the effect of one variable on another at a specific value of a third variable. For example:

$\theta_{X \rightarrow Y \mid W=w} = b_1 + b_3 \cdot w$

This reads: "The effect of $X$ on $Y$ when $W = w$ equals $b_1 + b_3 \times w$ ." Understanding conditional effects is the key to interpreting moderation results.
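A short sketch may help connect the interaction term to the conditional effect. The coefficient values below are hypothetical, chosen only for illustration; the loop confirms that a one-unit increase in $X$ changes the predicted $Y$ by exactly $b_1 + b_3 w$.

```python
def predict(x, w, b0=2.0, b1=0.40, b2=0.30, b3=-0.15):
    """Moderation model: Y = b0 + b1*X + b2*W + b3*(X*W)."""
    return b0 + b1 * x + b2 * w + b3 * x * w

def conditional_effect(w, b1=0.40, b3=-0.15):
    """Effect of a one-unit increase in X when W = w."""
    return b1 + b3 * w

for w in (-1.0, 0.0, 1.0):
    change_in_y = predict(1.0, w) - predict(0.0, w)
    print(w, round(conditional_effect(w), 3), round(change_in_y, 3))
```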

1.7 Bootstrapping

Bootstrapping is a resampling method used to construct confidence intervals for statistics whose sampling distributions are unknown or non-normal (such as the product of two regression coefficients $a \times b$ ). The algorithm is:

Draw $B$ bootstrap samples of size $n$ from the original data with replacement.
Compute the statistic of interest (e.g., indirect effect $\hat{a} \times \hat{b}$ ) in each bootstrap sample.
The distribution of the $B$ bootstrap estimates approximates the sampling distribution.
The 95% percentile CI is bounded by the 2.5th and 97.5th percentiles of this bootstrap distribution.

Bootstrapping is the gold standard for testing indirect effects in mediation analysis. Typically $B = 5{,}000$ to $B = 10{,}000$ bootstrap samples are used.
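The four steps above can be sketched as a percentile bootstrap for the indirect effect. This is an illustrative implementation on simulated data, with function names that are assumptions of this example; it uses $B = 2{,}000$ only to keep the run short.

```python
import numpy as np

def indirect_effect(x, m, y):
    """a*b from the two OLS equations M ~ X and Y ~ X + M."""
    ones = np.ones(len(x))
    a = np.linalg.lstsq(np.column_stack([ones, x]), m, rcond=None)[0][1]
    b = np.linalg.lstsq(np.column_stack([ones, x, m]), y, rcond=None)[0][2]
    return a * b

def percentile_bootstrap_ci(x, m, y, n_boot=2000, level=0.95, seed=0):
    rng = np.random.default_rng(seed)
    n = len(x)
    estimates = np.empty(n_boot)
    for r in range(n_boot):
        idx = rng.integers(0, n, size=n)   # resample rows with replacement
        estimates[r] = indirect_effect(x[idx], m[idx], y[idx])
    tail = (1 - level) / 2 * 100
    lower, upper = np.percentile(estimates, [tail, 100 - tail])
    return lower, upper

# Simulated data with a true indirect effect of 0.6 * 0.6 = 0.36.
rng = np.random.default_rng(42)
n = 300
x = rng.normal(size=n)
m = 0.6 * x + rng.normal(size=n)
y = 0.2 * x + 0.6 * m + rng.normal(size=n)

point = indirect_effect(x, m, y)
ci_low, ci_high = percentile_bootstrap_ci(x, m, y)
print(f"ab = {point:.3f}, 95% CI [{ci_low:.3f}, {ci_high:.3f}]")
```

Whole rows are resampled so that each bootstrap sample keeps the $X$, $M$, and $Y$ values of a case together.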

2. What is Mediation and Moderation Analysis?

2.1 The Core Questions

Mediation and moderation analysis addresses two complementary questions about how and when relationships between variables exist:

Mediation asks HOW or WHY does $X$ affect $Y$ ?

"Does the effect of stress ( $X$ ) on depression ( $Y$ ) operate through rumination ( $M$ )?"

Moderation asks WHEN or FOR WHOM does $X$ affect $Y$ ?

"Is the effect of stress ( $X$ ) on depression ( $Y$ ) stronger for people with low social support ( $W$ ) compared to people with high social support ( $W$ )?"

Conditional Process Analysis asks HOW AND WHEN simultaneously:

"Does the indirect effect of stress on depression through rumination depend on the level of social support?"

2.2 The Fundamental Distinction

Concept	Question	Mechanism	Variable Type
Mediation	How/Why?	$X \rightarrow M \rightarrow Y$	$M$ is a mediator (intermediate variable)
Moderation	When/For whom?	$W$ changes the $X \rightarrow Y$ relationship	$W$ is a moderator (context variable)
Conditional Process	How AND When?	Indirect effect depends on $W$	Both $M$ (mediator) and $W$ (moderator) present

2.3 Visual Summary of Core Concepts

Mediation (M is on the path from X to Y):

    X ─────────→ Y
      ↘        ↗
         M (Mediator)

Moderation (W changes the strength/direction of X → Y):

         W (Moderator)
         ↓
    X ─────────→ Y    (effect of X depends on W)

Moderated Mediation (Conditional Process):

         W (Moderator)
         ↓
    X ─────────→ Y
      ↘        ↗
         M (Mediator)

2.4 Real-World Applications

Field	Mediation Example	Moderation Example
Psychology	Exercise → Self-efficacy → Wellbeing	Exercise → Wellbeing (stronger for introverts)
Medicine	Drug → Inflammation reduction → Pain relief	Drug → Pain relief (varies by genetic marker)
Education	Training → Self-regulation → Academic performance	Training → Performance (varies by prior knowledge)
Marketing	Ad exposure → Brand attitude → Purchase intent	Ad exposure → Purchase (stronger for low involvement)
Management	Leadership style → Motivation → Productivity	Leadership → Productivity (varies by culture)
Public Health	Policy → Behaviour change → Health outcome	Policy → Health (varies by socioeconomic status)
Neuroscience	Stress → Cortisol → Memory impairment	Stress → Memory (moderated by amygdala volume)
Social Science	Poverty → Social isolation → Crime	Poverty → Crime (moderated by community cohesion)

2.5 A Brief History

The Baron and Kenny (1986) causal steps approach was the dominant method for mediation analysis for two decades. It required four conditions to be satisfied for mediation to be claimed. However, it has been largely superseded by the product of coefficients approach combined with bootstrapped confidence intervals, advocated by Preacher & Hayes (2004, 2008) and formalised in Hayes' (2013, 2022) PROCESS macro framework — which is the approach implemented in DataStatPro.

3. The Mathematics Behind Mediation and Moderation

3.1 The Simple Mediation Model — Equations

The simplest mediation model involves three variables: a predictor $X$ , a mediator $M$ , and an outcome $Y$ . It requires two regression equations:

Equation 1 — Predicting the mediator from $X$ :

$M = i_M + aX + e_M$

Equation 2 — Predicting the outcome from both $X$ and $M$ :

$Y = i_Y + c'X + bM + e_Y$

Where:

$a$ = effect of $X$ on $M$ (the a-path).
$b$ = effect of $M$ on $Y$ , controlling for $X$ (the b-path).
$c'$ = effect of $X$ on $Y$ , controlling for $M$ (the direct effect of $X$ ).
$i_M$ , $i_Y$ = intercepts.
$e_M$ , $e_Y$ = residual errors.

3.2 The Three Effects in Mediation

The Total Effect ( $c$ ):

The total effect is the effect of $X$ on $Y$ without including $M$ :

$Y = i_Y + cX + e_Y$

$c$ combines the direct and indirect pathways.

The Indirect Effect ( $a \times b$ ):

The indirect effect is the product of the $a$ -path and the $b$ -path:

$\text{Indirect Effect} = a \times b$

It quantifies how much of the effect of $X$ on $Y$ is transmitted through $M$ .

The Direct Effect ( $c'$ ):

The direct effect is the effect of $X$ on $Y$ after accounting for $M$ :

$c' = c - ab$

The Fundamental Identity:

The total effect equals the direct effect plus the indirect effect:

$c = c' + ab$

This decomposition is the cornerstone of mediation analysis. Every unit of the total effect can be attributed to either the direct path ( $c'$ ) or the indirect path through the mediator ( $ab$ ).
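For OLS estimates computed on the same sample, the identity $c = c' + ab$ holds exactly, not just approximately. A minimal check on simulated data (the helper `ols` and the generating values are assumptions of this sketch):

```python
import numpy as np

def ols(y, *predictors):
    """OLS coefficients, intercept first."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(7)
n = 250
x = rng.normal(size=n)
m = 0.58 * x + rng.normal(size=n)
y = 0.31 * x + 0.44 * m + rng.normal(size=n)

a = ols(m, x)[1]                 # a-path: M ~ X
_, c_prime, b = ols(y, x, m)     # direct effect and b-path: Y ~ X + M
c = ols(y, x)[1]                 # total effect: Y ~ X

print(f"c = {c:.4f}, c' + ab = {c_prime + a * b:.4f}")
```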

3.3 The Proportion Mediated

The proportion mediated (also called the indirect effect ratio) estimates what fraction of the total effect passes through the mediator:

$PM = \frac{ab}{c} = \frac{ab}{c' + ab} = 1 - \frac{c'}{c}$

⚠️ The proportion mediated is unstable when the total effect $c$ is near zero. This can happen even when the indirect effect $ab$ is substantial, if the direct and indirect effects have opposite signs (inconsistent mediation); PM can then exceed 1 or become negative. Never rely on the proportion mediated as the primary evidence for mediation.
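The warning above is easy to reproduce. In this sketch the effect sizes are hypothetical; the second call uses a direct effect of opposite sign that nearly cancels the indirect effect.

```python
def proportion_mediated(ab, c_prime):
    """PM = ab / c, with the total effect c = c' + ab."""
    c = c_prime + ab
    if c == 0:
        raise ZeroDivisionError("total effect is zero; PM is undefined")
    return ab / c

pm_consistent = proportion_mediated(ab=0.20, c_prime=0.30)      # same signs
pm_inconsistent = proportion_mediated(ab=0.20, c_prime=-0.19)   # opposite signs
print(round(pm_consistent, 3), round(pm_inconsistent, 3))
```

The same indirect effect of 0.20 yields a sensible ratio in the first case and a value far above 1 in the second, which is why PM should not be read as a percentage when $c$ is small.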

3.4 Multiple Mediators — Parallel Mediation

With $k$ parallel mediators $M_1, M_2, \dots, M_k$ , the model uses $k + 1$ equations:

Equations for each mediator $j = 1, \dots, k$ :

$M_j = i_{M_j} + a_j X + e_{M_j}$

Outcome equation:

$Y = i_Y + c'X + b_1 M_1 + b_2 M_2 + \dots + b_k M_k + e_Y$

Specific indirect effect through mediator $j$ :

$\text{Indirect}_j = a_j b_j$

Total indirect effect:

$\text{Total Indirect} = \sum_{j=1}^{k} a_j b_j$

Total effect:

$c = c' + \sum_{j=1}^{k} a_j b_j$

Contrasts between indirect effects: The difference between two specific indirect effects can be tested:

$\text{Contrast}_{jl} = a_j b_j - a_l b_l$

Here $j$ and $l$ index two different mediators ( $k$ is reserved for the number of mediators). A bootstrapped CI for this contrast that excludes zero indicates that mediators $j$ and $l$ transmit significantly different portions of the effect of $X$ to $Y$.
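A sketch of the parallel model with two mediators on simulated data follows. It computes the specific indirect effects, their sum, and their contrast as point estimates; the names and generating values are assumptions of this example.

```python
import numpy as np

def ols(y, *predictors):
    """OLS coefficients, intercept first."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(3)
n = 300
x = rng.normal(size=n)
m1 = 0.5 * x + rng.normal(size=n)
m2 = -0.3 * x + rng.normal(size=n)
y = 0.2 * x + 0.4 * m1 + 0.6 * m2 + rng.normal(size=n)

a1 = ols(m1, x)[1]
a2 = ols(m2, x)[1]
_, c_prime, b1, b2 = ols(y, x, m1, m2)   # both mediators in one outcome equation
c = ols(y, x)[1]

specific = np.array([a1 * b1, a2 * b2])  # specific indirect effects
total_indirect = specific.sum()
contrast = specific[0] - specific[1]
print(specific.round(3), round(total_indirect, 3), round(contrast, 3))
```

Inference on the contrast still requires a bootstrapped CI; the point estimate alone says nothing about significance.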

3.5 Sequential (Serial) Mediation

With two sequential mediators $M_1$ and $M_2$ (where $M_1$ precedes $M_2$ on the causal chain):

Equation 1:

$M_1 = i_{M_1} + a_1 X + e_{M_1}$

Equation 2:

$M_2 = i_{M_2} + a_2 X + d_{21} M_1 + e_{M_2}$

Equation 3:

$Y = i_Y + c'X + b_1 M_1 + b_2 M_2 + e_Y$

Where $d_{21}$ is the effect of $M_1$ on $M_2$ .

The three indirect effects:

Path	Indirect Effect	Interpretation
$X \rightarrow M_1 \rightarrow Y$	$a_1 b_1$	Through $M_1$ only
$X \rightarrow M_2 \rightarrow Y$	$a_2 b_2$	Through $M_2$ only
$X \rightarrow M_1 \rightarrow M_2 \rightarrow Y$	$a_1 d_{21} b_2$	Through both mediators sequentially

Total indirect effect:

$\text{Total Indirect} = a_1 b_1 + a_2 b_2 + a_1 d_{21} b_2$

Total effect identity:

$c = c' + a_1 b_1 + a_2 b_2 + a_1 d_{21} b_2$
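The three serial-mediation equations and the total effect identity can be verified on simulated data. This is an illustrative sketch; the helper `ols` and all generating values are assumptions.

```python
import numpy as np

def ols(y, *predictors):
    """OLS coefficients, intercept first."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(9)
n = 400
x = rng.normal(size=n)
m1 = 0.5 * x + rng.normal(size=n)
m2 = 0.2 * x + 0.4 * m1 + rng.normal(size=n)
y = 0.1 * x + 0.3 * m1 + 0.5 * m2 + rng.normal(size=n)

a1 = ols(m1, x)[1]                        # Equation 1
_, a2, d21 = ols(m2, x, m1)               # Equation 2
_, c_prime, b1, b2 = ols(y, x, m1, m2)    # Equation 3
c = ols(y, x)[1]                          # total effect

indirect = {
    "X->M1->Y": a1 * b1,
    "X->M2->Y": a2 * b2,
    "X->M1->M2->Y": a1 * d21 * b2,
}
total_indirect = sum(indirect.values())
print({k: round(v, 3) for k, v in indirect.items()})
```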

3.6 The Simple Moderation (Interaction) Model

The simple moderation model includes the predictor $X$ , the moderator $W$ , and their product:

$Y = b_0 + b_1 X + b_2 W + b_3 (X \times W) + e$

Where:

$b_1$ = effect of $X$ on $Y$ when $W = 0$ (the conditional direct effect of $X$ at $W = 0$ ).
$b_2$ = effect of $W$ on $Y$ when $X = 0$ .
$b_3$ = the interaction effect — how much the effect of $X$ on $Y$ changes for each one-unit increase in $W$ .
$b_0$ = the intercept when both $X = 0$ and $W = 0$ .

The conditional effect of $X$ on $Y$ at a specific value $w$ of $W$ :

$\hat{\theta}_{X \rightarrow Y}(w) = b_1 + b_3 w$

This is the simple slope of $Y$ on $X$ at the value $W = w$ . The moderation hypothesis is tested by examining whether $b_3 \neq 0$ .

3.7 Centering in Moderation Analysis

Mean-centering the predictor and moderator before computing the interaction term is strongly recommended:

$X_c = X - \bar{X}, \quad W_c = W - \bar{W}$

The model becomes:

$Y = b_0 + b_1 X_c + b_2 W_c + b_3 (X_c \times W_c) + e$

Benefits of mean-centering:

Reduces multicollinearity between $X$ , $W$ , and $X \times W$ (without changing the interaction coefficient $b_3$ ).
Makes the intercept $b_0$ interpretable as the predicted value of $Y$ when $X$ and $W$ are at their means.
Makes $b_1$ interpretable as the effect of $X$ on $Y$ at the mean of $W$ (not at $W = 0$ , which may be outside the observed range).

⚠️ Centering does NOT change $b_3$ (the interaction coefficient) or the model's $R^2$ . It only changes the interpretation of $b_1$ and $b_2$ and reduces multicollinearity.
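The invariance of $b_3$ and $R^2$ under centering can be confirmed directly. The sketch below fits the same simulated data twice, with raw and with mean-centred predictors; the function name is an assumption of this example.

```python
import numpy as np

def fit_moderation(x, w, y):
    """OLS for Y = b0 + b1*X + b2*W + b3*(X*W); returns coefficients and R^2."""
    X = np.column_stack([np.ones(len(y)), x, w, x * w])
    coef = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ coef
    r2 = 1 - resid @ resid / np.sum((y - y.mean()) ** 2)
    return coef, r2

rng = np.random.default_rng(21)
n = 300
x = rng.normal(loc=5.0, scale=1.0, size=n)   # raw scores far from zero
w = rng.normal(loc=3.0, scale=1.0, size=n)
y = 1.0 + 0.5 * x + 0.3 * w + 0.2 * x * w + rng.normal(size=n)

raw, r2_raw = fit_moderation(x, w, y)
centred, r2_centred = fit_moderation(x - x.mean(), w - w.mean(), y)

print("b3:", round(raw[3], 4), round(centred[3], 4))
print("b1:", round(raw[1], 4), round(centred[1], 4))
```

The centred $b_1$ equals the raw $b_1$ plus $b_3 \bar{W}$: it is the effect of $X$ at the mean of $W$ rather than at $W = 0$.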

3.8 Probing the Interaction: Simple Slopes

After establishing a significant interaction ( $b_3 \neq 0$ ), the interaction must be probed to understand its nature. The simple slopes method computes the conditional effect of $X$ on $Y$ at representative values of $W$ :

Standard values of $W$ (Pick-a-point approach):

Low: $W = \bar{W} - 1\text{SD}_W$ (one SD below the mean).
Medium: $W = \bar{W}$ (the mean).
High: $W = \bar{W} + 1\text{SD}_W$ (one SD above the mean).

Simple slope at each value $w$ :

$\hat{\theta}_{X \rightarrow Y}(w) = b_1 + b_3 w$

Standard error of the simple slope:

$SE[\hat{\theta}(w)] = \sqrt{\widehat{\text{Var}}(b_1) + 2w \cdot \widehat{\text{Cov}}(b_1, b_3) + w^2 \cdot \widehat{\text{Var}}(b_3)}$

t-statistic for the simple slope:

$t = \frac{\hat{\theta}(w)}{SE[\hat{\theta}(w)]}$

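The statistic is compared against a $t$ distribution with $df = n - k - 1$, where $k$ is the number of predictors in the model (here $X$, $W$, $X \times W$, and any covariates).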
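A sketch of the pick-a-point computation, assuming simulated, mean-centred data and classical OLS standard errors; the function names are illustrative. The coefficient covariance matrix supplies the variance and covariance terms in the standard error formula.

```python
import numpy as np

def fit_moderation(x, w, y):
    """Return OLS coefficients, their covariance matrix, and residual df."""
    X = np.column_stack([np.ones(len(y)), x, w, x * w])
    coef = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ coef
    df = len(y) - X.shape[1]               # n - k - 1 with k = 3 predictors
    sigma2 = resid @ resid / df
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return coef, cov, df

def simple_slope(coef, cov, w):
    """Conditional effect of X at W = w, with its SE and t-statistic."""
    theta = coef[1] + coef[3] * w
    var = cov[1, 1] + 2 * w * cov[1, 3] + w**2 * cov[3, 3]
    se = np.sqrt(var)
    return theta, se, theta / se

rng = np.random.default_rng(11)
n = 300
x = rng.normal(size=n)
w = rng.normal(size=n)
x, w = x - x.mean(), w - w.mean()          # mean-centre both variables
y = 0.4 * x + 0.2 * w + 0.3 * x * w + rng.normal(size=n)

coef, cov, df = fit_moderation(x, w, y)
sd_w = w.std(ddof=1)
for label, w_val in (("low", -sd_w), ("mean", 0.0), ("high", sd_w)):
    theta, se, t = simple_slope(coef, cov, w_val)
    print(f"{label:>4}: slope={theta:.3f}, SE={se:.3f}, t={t:.2f}, df={df}")
```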

Johnson-Neyman (Floodlight) Method: Rather than evaluating at specific values of $W$ , the Johnson-Neyman technique finds the exact value(s) of $W$ at which the simple slope transitions from significant to non-significant (the region of significance):

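$w^* = \frac{-\left(b_1 b_3 - t_{\alpha/2}^2 \widehat{\text{Cov}}(b_1, b_3)\right) \pm t_{\alpha/2} \sqrt{b_3^2 \widehat{\text{Var}}(b_1) - 2 b_1 b_3 \widehat{\text{Cov}}(b_1, b_3) + b_1^2 \widehat{\text{Var}}(b_3) - t_{\alpha/2}^2 \left[\widehat{\text{Var}}(b_1)\widehat{\text{Var}}(b_3) - \widehat{\text{Cov}}(b_1, b_3)^2\right]}}{b_3^2 - t_{\alpha/2}^2 \widehat{\text{Var}}(b_3)}$

These are the roots of the quadratic in $w$ obtained by setting $\hat{\theta}(w)^2 = t_{\alpha/2}^2 \, SE[\hat{\theta}(w)]^2$, using the simple slope and standard error formulas above.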

The region of significance is the range of $W$ values where the simple slope of $X$ on $Y$ is statistically significant ( $p < \alpha$ ).
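The boundaries can also be found by solving the quadratic that results from setting the squared simple slope equal to $t_{\alpha/2}^2$ times its squared standard error. The sketch below does this with hypothetical coefficient estimates; the critical value `t_crit` is passed in as an assumption and would normally be looked up from the $t$ distribution with $n - k - 1$ degrees of freedom.

```python
import math

def johnson_neyman(b1, b3, var_b1, var_b3, cov_b13, t_crit):
    """Values of W at which |simple slope / SE| equals t_crit."""
    # (b1 + b3*w)^2 = t^2 * (var_b1 + 2*w*cov_b13 + w^2*var_b3)
    # rearranged to A*w^2 + B*w + C = 0.
    A = b3**2 - t_crit**2 * var_b3
    B = 2 * (b1 * b3 - t_crit**2 * cov_b13)
    C = b1**2 - t_crit**2 * var_b1
    disc = B**2 - 4 * A * C
    if A == 0 or disc < 0:
        return []          # no real transition points
    root = math.sqrt(disc)
    return sorted([(-B - root) / (2 * A), (-B + root) / (2 * A)])

# Hypothetical estimates for illustration only.
b1, b3 = 0.30, 0.20
var_b1, var_b3, cov_b13 = 0.010, 0.004, -0.001
t_crit = 1.97

bounds = johnson_neyman(b1, b3, var_b1, var_b3, cov_b13, t_crit)
print([round(w, 3) for w in bounds])
```

Boundaries that fall outside the observed range of $W$ have no practical meaning, so they should always be compared with the data.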

3.9 The Conditional Process Model — Moderated Mediation

The moderated mediation model combines mediation and moderation. The indirect effect $a \times b$ becomes a function of the moderator $W$ :

General form (moderator on the $a$ -path):

$M = i_M + a_1 X + a_2 W + a_3 (X \times W) + e_M$

$Y = i_Y + c'X + bM + e_Y$

The conditional indirect effect at value $w$ of $W$ :

$\text{Indirect Effect}(w) = (a_1 + a_3 w) \times b$

This is a linear function of $w$ with slope $a_3 b$, the index of moderated mediation (Hayes, 2015). If the bootstrapped CI for $a_3 b$ excludes zero, the indirect effect is significantly moderated.

General form (moderator on the $b$ -path):

$M = i_M + a X + e_M$

$Y = i_Y + c'X + b_1 M + b_2 W + b_3 (M \times W) + e_Y$

The conditional indirect effect at value $w$ of $W$ :

$\text{Indirect Effect}(w) = a \times (b_1 + b_3 w)$

Index of moderated mediation: $a \times b_3$
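Both conditional indirect effects are simple functions of the path estimates. The sketch below uses hypothetical coefficients and shows that the index of moderated mediation is the change in the indirect effect per one-unit increase in $W$.

```python
def indirect_first_stage(a1, a3, b, w):
    """W moderates the a-path: (a1 + a3*w) * b."""
    return (a1 + a3 * w) * b

def indirect_second_stage(a, b1, b3, w):
    """W moderates the b-path: a * (b1 + b3*w)."""
    return a * (b1 + b3 * w)

# Hypothetical path estimates for illustration only.
a1, a3, b = 0.50, 0.20, 0.40
index_first_stage = a3 * b

for w in (-1.0, 0.0, 1.0):
    print(w, round(indirect_first_stage(a1, a3, b, w), 3))

a, b1, b3 = 0.50, 0.40, -0.10
index_second_stage = a * b3
```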

4. Assumptions of Mediation and Moderation Analysis

4.1 Causal Ordering (Temporal Precedence)

The most fundamental assumption of mediation analysis is that the hypothesised causal ordering is correct: $X$ must precede $M$ in time, and $M$ must precede $Y$ in time.

Why it matters: Mediation analysis estimates causal indirect effects. If the temporal order is wrong (e.g., $M$ actually precedes $X$), the computed $a \times b$ product does not represent a causal indirect effect; it only summarises statistical associations among the three variables.

How to check:

Use longitudinal data where $X$ , $M$ , and $Y$ are measured at different time points.
In experimental designs, randomise $X$ to ensure it causally precedes $M$ .
In cross-sectional data, rely on theory and prior literature to justify the causal order.

⚠️ Cross-sectional data alone cannot establish causal mediation. Mediation from cross-sectional data should always be described as "consistent with a mediation hypothesis" rather than "demonstrating causal mediation."

4.2 No Unmeasured Confounding

Both the $X \rightarrow M$ relationship and the $M \rightarrow Y$ relationship must be free of unmeasured confounders — variables that affect both the predictor/mediator and the outcome simultaneously.

Why it matters: If an unmeasured variable $C$ affects both $M$ and $Y$ , the $b$ -path estimate (effect of $M$ on $Y$ ) is biased, and the indirect effect $ab$ does not represent a causal effect.

How to address:

Measure and control for known confounders.
Use experimental manipulation of $M$ when possible.
Conduct sensitivity analysis (e.g., Imai et al.'s sensitivity parameter $\rho$ ) to assess how large an unmeasured confounder would need to be to nullify the indirect effect.

4.3 Linearity

The model assumes that all relationships (paths) are linear. Non-linear relationships (e.g., curvilinear effects, threshold effects) are not captured by standard mediation and moderation models.

How to check:

Plot residuals against fitted values (should show no pattern).
Add polynomial terms (e.g., $X^2$ ) to the model and test their significance.
Use partial regression plots (added variable plots) for each predictor.

4.4 Normally Distributed Residuals

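The residuals $e_M$ and $e_Y$ should be approximately normally distributed. Normality is what makes the t-tests and F-tests of regression coefficients exact in small samples; in large samples these tests remain approximately valid under moderate non-normality.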

How to check:

Q-Q plots of residuals.
Shapiro-Wilk test of residuals.
Histograms of residuals.

Remedy when violated: Use bootstrapped confidence intervals for indirect effects — bootstrapping does not require normality of the indirect effect and is robust to non-normal residuals.

4.5 Homoscedasticity

The variance of the residuals should be constant across all levels of the predictors (homoscedasticity). Heteroscedasticity (non-constant variance) inflates or deflates standard errors, affecting the validity of significance tests.

How to check:

Breusch-Pagan test.
White test.
Plot residuals vs. fitted values: look for a fan shape.

Remedy when violated: Use heteroscedasticity-consistent (HC) standard errors (White's robust SEs). DataStatPro implements HC3 robust standard errors as an option.
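For readers who want to see what an HC3 correction does, here is a minimal NumPy sketch of the standard HC3 sandwich estimator on simulated heteroscedastic data. It illustrates the general formula and is not a description of how DataStatPro computes it; the function name is an assumption.

```python
import numpy as np

def ols_with_hc3(X, y):
    """OLS coefficients with classical and HC3 standard errors."""
    XtX_inv = np.linalg.inv(X.T @ X)
    coef = XtX_inv @ X.T @ y
    resid = y - X @ coef
    n, p = X.shape
    # Leverage h_ii: diagonal of the hat matrix X (X'X)^-1 X'.
    leverage = np.einsum("ij,jk,ik->i", X, XtX_inv, X)
    # HC3 weights each squared residual by 1 / (1 - h_ii)^2.
    omega = (resid / (1 - leverage)) ** 2
    cov_hc3 = XtX_inv @ (X.T @ (X * omega[:, None])) @ XtX_inv
    cov_classic = (resid @ resid / (n - p)) * XtX_inv
    return coef, np.sqrt(np.diag(cov_classic)), np.sqrt(np.diag(cov_hc3))

rng = np.random.default_rng(5)
n = 500
x = rng.normal(size=n)
# Error SD grows with |x|, so the constant-variance assumption is violated.
y = 1.0 + 0.5 * x + rng.normal(size=n) * (0.2 + np.abs(x))
X = np.column_stack([np.ones(n), x])

coef, se_classic, se_hc3 = ols_with_hc3(X, y)
print("classical SE:", se_classic.round(4), "HC3 SE:", se_hc3.round(4))
```

In this simulation the classical slope SE understates the uncertainty, and the HC3 value is noticeably larger.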

4.6 Independence of Observations

Each observation must be independent of all others. Violations occur with clustered data (e.g., students within schools), repeated measures (same subject at multiple times), or dyadic data (e.g., couples).

How to address:

Clustered data: Use multilevel mediation/moderation models.
Longitudinal/repeated measures: Use cross-lagged panel models or cross-lagged mediation.
Dyadic data: Use Actor-Partner Interdependence Model (APIM) with mediation extensions.

4.7 No Perfect Multicollinearity

Predictors should not be perfectly correlated with each other. In moderation analysis, the interaction term $X \times W$ is often highly correlated with $X$ and $W$ individually (multicollinearity), which can inflate standard errors and destabilise coefficient estimates.

How to check:

Variance Inflation Factor (VIF): $\text{VIF} > 10$ is typically a concern.
Condition index from eigenvalue decomposition.

Remedy: Mean-center $X$ and $W$ before forming the interaction term $X \times W$ (as described in Section 3.7). This substantially reduces multicollinearity without changing $b_3$ .
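The VIF for each predictor can be computed as $1 / (1 - R_j^2)$, where $R_j^2$ comes from regressing predictor $j$ on the remaining predictors. The sketch below compares VIFs for raw and mean-centred versions of $X$, $W$, and $X \times W$ on simulated data; the function name is an assumption.

```python
import numpy as np

def vif(predictors):
    """VIF per column of an (n, k) predictor matrix without intercept."""
    n, k = predictors.shape
    out = np.empty(k)
    for j in range(k):
        target = predictors[:, j]
        others = np.column_stack([np.ones(n), np.delete(predictors, j, axis=1)])
        fitted = others @ np.linalg.lstsq(others, target, rcond=None)[0]
        r2 = 1 - np.sum((target - fitted) ** 2) / np.sum((target - target.mean()) ** 2)
        out[j] = 1 / (1 - r2)
    return out

rng = np.random.default_rng(13)
n = 300
x = rng.normal(loc=5.0, size=n)
w = rng.normal(loc=3.0, size=n)
xc, wc = x - x.mean(), w - w.mean()

vif_raw = vif(np.column_stack([x, w, x * w]))
vif_centred = vif(np.column_stack([xc, wc, xc * wc]))
print("raw:", vif_raw.round(1), "centred:", vif_centred.round(2))
```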

4.8 Adequate Sample Size and Statistical Power

Mediation analysis (especially bootstrapped indirect effects) requires sufficient sample size for stable and reproducible results:

Model Type	Minimum $n$	Recommended $n$
Simple mediation	100	$\geq 200$
Parallel mediation (2 mediators)	150	$\geq 300$
Sequential mediation	200	$\geq 400$
Simple moderation	100	$\geq 200$
Moderated mediation	200	$\geq 400$
Complex conditional process	300	$\geq 500$

💡 Use Monte Carlo power analysis (e.g., via the pwrss or pwr2ppl packages in R, or the DataStatPro power module) to determine the required sample size for your specific model and effect size prior to data collection.
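The idea behind a Monte Carlo power analysis can be sketched in a few lines: simulate many datasets from assumed path values, test the indirect effect in each, and report the proportion of significant results. To keep the run short, this sketch swaps the bootstrap test for the simpler joint-significance test (both $a$ and $b$ significant) with a normal critical value of 1.96. All names and effect sizes are assumptions, and the packages mentioned above use their own methods.

```python
import numpy as np

def slope_and_se(X, y, col):
    """OLS estimate and classical SE for one column of the design matrix."""
    XtX_inv = np.linalg.inv(X.T @ X)
    coef = XtX_inv @ X.T @ y
    resid = y - X @ coef
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    return coef[col], np.sqrt(sigma2 * XtX_inv[col, col])

def mediation_power(n, a, b, c_prime=0.0, n_sims=500, z_crit=1.96, seed=0):
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_sims):
        x = rng.normal(size=n)
        m = a * x + rng.normal(size=n)
        y = c_prime * x + b * m + rng.normal(size=n)
        ones = np.ones(n)
        a_hat, se_a = slope_and_se(np.column_stack([ones, x]), m, 1)
        b_hat, se_b = slope_and_se(np.column_stack([ones, x, m]), y, 2)
        # Joint-significance test: both paths must be significant.
        if abs(a_hat / se_a) > z_crit and abs(b_hat / se_b) > z_crit:
            hits += 1
    return hits / n_sims

power_small_n = mediation_power(n=50, a=0.3, b=0.3)
power_large_n = mediation_power(n=200, a=0.3, b=0.3)
print(power_small_n, power_large_n)
```

Repeating the call over a grid of $n$ values gives a power curve from which the smallest adequate sample size can be read off.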

5. Types of Mediation and Moderation Models

5.1 Mediation Models

Model	Structure	Key Feature
Simple Mediation	$X \rightarrow M \rightarrow Y$	One mediator; single indirect path
Parallel Multiple Mediation	$X \rightarrow \{M_1, M_2, \dots, M_k\} \rightarrow Y$	Multiple mediators operating simultaneously; not causally connected
Sequential (Serial) Mediation	$X \rightarrow M_1 \rightarrow M_2 \rightarrow Y$	Mediators are causally chained; $M_1$ affects $M_2$
Multiple Sequential Mediation	$X \rightarrow M_1 \rightarrow M_2 \rightarrow M_3 \rightarrow Y$	Three or more sequential mediators
Partial Mediation	$X \rightarrow M \rightarrow Y$ AND $X \rightarrow Y$	Both direct and indirect effects are non-zero
Full Mediation	$X \rightarrow M \rightarrow Y$ (direct = 0)	All of $X$ 's effect on $Y$ operates through $M$
Inconsistent Mediation	$ab > 0$ but $c' < 0$ (or vice versa)	Direct and indirect effects have opposite signs

5.2 Moderation Models

Model	Structure	Key Feature
Simple Moderation	$W$ moderates $X \rightarrow Y$	One moderator; one interaction term
Multiple Moderation	$W_1$ and $W_2$ both moderate $X \rightarrow Y$	Two moderators tested simultaneously
Three-Way (Moderated Moderation)	$V$ moderates the $W \times X$ interaction	The interaction itself is moderated
Moderated Regression with Covariates	Moderation with control variables	Interaction tested after controlling for confounders

5.3 Conditional Process Models (Moderated Mediation / Mediated Moderation)

Model	Where Moderation Occurs	Hayes PROCESS Model
Moderated Mediation (First-Stage)	$W$ moderates the $a$ -path ( $X \rightarrow M$ )	Model 7
Moderated Mediation (Second-Stage)	$W$ moderates the $b$ -path ( $M \rightarrow Y$ )	Model 14
Moderated Mediation (Both Stages)	$W$ moderates both $a$ and $b$ paths	Model 58
Moderated Mediation (Direct Path)	$W$ moderates the direct effect $c'$	Model 5
Mediated Moderation	$M$ mediates the $X \times W \rightarrow Y$ interaction	Model 8
Sequential Moderated Mediation	$W$ moderates a path in a sequential mediation model	Models 83, 84, 85
Dual Moderated Mediation	Two separate moderators on two different paths	Model 21

5.4 Key Terminology Clarification

Term	Definition	Notes
Indirect Effect	$a \times b$ (product of path coefficients through $M$ )	The core quantity in mediation
Direct Effect	$c'$ (effect of $X$ on $Y$ after accounting for $M$ )	Residual effect not through $M$
Total Effect	$c = c' + ab$	Sum of direct and all indirect effects
Conditional Direct Effect	$c'(w) = b_1 + b_3 w$	Direct effect that varies with $W$; $b_1$ and $b_3$ follow the moderation notation of Section 3.6, not the $b$-path
Conditional Indirect Effect	$a(w) \times b$ or $a \times b(w)$	Indirect effect that varies with $W$
Index of Moderated Mediation	Coefficient linking $W$ to the indirect effect	$a_3 b$ or $a b_3$
Region of Significance	Range of $W$ where simple slope is $p < .05$	From Johnson-Neyman method
Floodlight Analysis	Johnson-Neyman visualisation across all values of $W$	Shows full region of significance

6. Using the Mediation and Moderation Component

The Mediation and Moderation component in DataStatPro provides a complete workflow for specifying, estimating, evaluating, and visualising all mediation, moderation, and conditional process models.

Step-by-Step Guide

Step 1 — Select Dataset

Choose the dataset from the "Dataset" dropdown. Ensure:

All variables (predictor, mediator(s), moderator(s), outcome, covariates) are in separate columns.
All variables are numeric (continuous or binary coded 0/1).
The dataset has been screened for missing data and outliers.

💡 Tip: Run descriptive statistics (means, SDs, skewness) before mediation/moderation analysis. Extreme skewness (skewness divided by its standard error, $|z| > 2$) may affect the normality of residuals and make bootstrapped CIs preferable.

Step 2 — Select Analysis Type

Choose from the "Analysis Type" dropdown:

Mediation Analysis — for pure mediation models (simple, parallel, sequential).
Moderation Analysis — for pure moderation models (simple, multiple, three-way).
Conditional Process Analysis — for combined moderated mediation / mediated moderation models.

Step 3 — Specify the Model Structure

Predictor (X): Select the independent variable.
Outcome (Y): Select the dependent variable.
Mediator(s) (M): Select one or more mediator variables (for mediation models). Specify the type of mediation:
- Parallel (mediators are not causally connected).
- Sequential (specify the causal order of mediators).
Moderator(s) (W, V): Select one or more moderator variables (for moderation and conditional process models).
Covariates (C): Select any control variables to include in all equations.

⚠️ Important: Specify the model structure based on your theory, NOT based on what produces the best statistics. Choosing the model after seeing the data is a form of HARKing (Hypothesising After Results are Known) and invalidates statistical inference.

Step 4 — Select the Conditional Process Model Type

If running a Conditional Process Analysis, select the specific model from the dropdown:

First-Stage Moderated Mediation — $W$ moderates the $X \rightarrow M$ path.
Second-Stage Moderated Mediation — $W$ moderates the $M \rightarrow Y$ path.
Both-Stages Moderated Mediation — $W$ moderates both $a$ and $b$ paths.
Mediated Moderation — $M$ mediates the $X \times W \rightarrow Y$ effect.
Sequential Moderated Mediation — sequential mediators with moderation.

Step 5 — Select Probing Options (for Moderation)

Specify how to probe significant interactions:

Spotlight Analysis (Pick-a-Point): Compute simple slopes at Low ( $-1\text{SD}$ ), Mean, and High ( $+1\text{SD}$ ) values of $W$ .
Johnson-Neyman (Floodlight) Analysis: Compute the exact region(s) of significance across all values of $W$ .
Custom values of $W$ : Enter specific percentiles or theoretically meaningful values.

💡 Tip: Always request both spotlight and Johnson-Neyman analyses. The pick-a-point approach is useful for visualisation; Johnson-Neyman gives a complete picture of where the interaction is and is not significant.

Step 6 — Configure Bootstrap Settings

Number of Bootstrap Samples ( $B$ ): Default = 5,000. Increase to 10,000 for publication.
Confidence Level: Default = 95% (BCa bootstrapped CIs). Change to 90% or 99% as needed.
Bootstrap Method:
- Percentile CI: Simple and widely used.
- Bias-Corrected and Accelerated (BCa) CI: Corrects for bias and skewness in the bootstrap distribution. Recommended for indirect effects.

💡 Recommendation: Use BCa bootstrapped CIs with $B = 10{,}000$ for indirect effects in published work. This is the gold standard for mediation analysis.

Step 7 — Mean-Centering Options

For models involving interaction terms (moderation or conditional process):

Mean-center all continuous predictors and moderators: Recommended — reduces multicollinearity.
Standardise all variables: Alternative — produces fully standardised coefficients.
No centering: Use only when $X = 0$ and $W = 0$ are meaningful reference points.

Step 8 — Display Options

Select which outputs and visualisations to display:

✅ Path diagram with estimated coefficients.
✅ Regression equation summaries for each equation in the model.
✅ Direct, indirect, and total effects table.
✅ Bootstrapped confidence intervals for indirect effects.
✅ Simple slopes table and plot (for moderation).
✅ Johnson-Neyman plot with region of significance.
✅ Interaction plot (Y vs. X at Low/Mean/High values of W).
✅ Conditional indirect effects table (for conditional process models).
✅ Index of moderated mediation with bootstrapped CI.

Step 9 — Run the Analysis

Click "Run Analysis". The application will:

Estimate all regression equations using OLS.
Compute direct, indirect, and total effects.
Generate $B$ bootstrap samples and compute the bootstrap distribution of indirect effects.
Construct BCa bootstrapped confidence intervals for all indirect effects.
Compute simple slopes at all specified values of $W$ .
Compute the Johnson-Neyman region of significance.
Generate all selected visualisations and output tables.

7. Mediation Analysis

7.1 Simple Mediation

Simple mediation tests whether the effect of $X$ on $Y$ is transmitted through a single mediator $M$ . This is the foundational mediation model.

7.1.1 Model Specification

The model requires two regression equations:

Path a equation:

$M = i_M + a X + e_M$

Path b and c' equation:

$Y = i_Y + c'X + bM + e_Y$

Total effect equation (for reference):

$Y = i_Y + cX + e_Y$

The path diagram is:

          a            b
    X ─────────→ M ─────────→ Y
    │                         ↑
    └─────────── c' ──────────┘

7.1.2 The Four Conditions (Baron & Kenny, Historical)

For historical context, the Baron-Kenny causal steps approach required:

$X$ significantly predicts $Y$ (total effect $c$ significant).
$X$ significantly predicts $M$ (a-path significant).
$M$ significantly predicts $Y$ controlling for $X$ (b-path significant).
The effect of $X$ on $Y$ shrinks in magnitude when $M$ is included ( $|c'| < |c|$ ).

⚠️ The Baron-Kenny approach is now considered outdated and is NOT recommended. Condition 1 is not required: mediation can exist without a significant total effect, for example under inconsistent mediation, where direct and indirect effects of opposite sign offset each other. Use bootstrapped indirect effects instead.

7.1.3 Modern Approach: Bootstrapped Indirect Effects

The modern approach tests mediation by constructing a bootstrapped confidence interval for the indirect effect $ab$ :

Estimate $\hat{a}$ and $\hat{b}$ from the data.
Compute $\hat{a} \times \hat{b}$ (point estimate of indirect effect).
Generate $B = 5{,}000$ bootstrap samples.
Compute $\hat{a}^*_r \times \hat{b}^*_r$ in each bootstrap sample $r = 1, \dots, B$ .
Construct the 95% BCa CI from the bootstrap distribution.
Decision: If the 95% CI for $ab$ excludes zero → significant mediation.
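The steps above can be sketched with a textbook BCa interval (bias correction from the bootstrap distribution, acceleration from jackknife estimates). This is an illustrative implementation on simulated data and may differ in detail from DataStatPro's; the function names and $B = 2{,}000$ are assumptions chosen to keep the example short.

```python
import numpy as np
from statistics import NormalDist

def indirect(data):
    """a*b for columns ordered as X, M, Y."""
    x, m, y = data[:, 0], data[:, 1], data[:, 2]
    ones = np.ones(len(x))
    a = np.linalg.lstsq(np.column_stack([ones, x]), m, rcond=None)[0][1]
    b = np.linalg.lstsq(np.column_stack([ones, x, m]), y, rcond=None)[0][2]
    return a * b

def bca_ci(data, stat, n_boot=2000, level=0.95, seed=0):
    rng = np.random.default_rng(seed)
    n = len(data)
    theta_hat = stat(data)
    boot = np.array([stat(data[rng.integers(0, n, size=n)]) for _ in range(n_boot)])

    norm = NormalDist()
    # Bias correction: share of bootstrap estimates below the point estimate.
    prop_below = np.clip(np.mean(boot < theta_hat), 1 / n_boot, 1 - 1 / n_boot)
    z0 = norm.inv_cdf(float(prop_below))

    # Acceleration from leave-one-out (jackknife) estimates.
    jack = np.array([stat(np.delete(data, i, axis=0)) for i in range(n)])
    diff = jack.mean() - jack
    accel = np.sum(diff**3) / (6 * np.sum(diff**2) ** 1.5)

    alpha = (1 - level) / 2
    probs = []
    for z in (norm.inv_cdf(alpha), norm.inv_cdf(1 - alpha)):
        adjusted = z0 + (z0 + z) / (1 - accel * (z0 + z))
        probs.append(norm.cdf(adjusted))
    lower, upper = np.quantile(boot, probs)
    return theta_hat, lower, upper

rng = np.random.default_rng(2024)
n = 200
x = rng.normal(size=n)
m = 0.6 * x + rng.normal(size=n)
y = 0.2 * x + 0.6 * m + rng.normal(size=n)

ab_hat, bca_low, bca_high = bca_ci(np.column_stack([x, m, y]), indirect)
significant = bca_low > 0 or bca_high < 0   # decision rule: CI excludes zero
print(f"ab = {ab_hat:.3f}, 95% BCa CI [{bca_low:.3f}, {bca_high:.3f}]")
```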

7.1.4 Types of Mediation Based on Effect Sizes

Pattern	$c$	$ab$	$c'$	Classification
No mediation	sig	not sig	sig	Direct effect only
Full mediation	sig	sig	not sig	All effect through $M$
Partial mediation	sig	sig	sig (same sign)	Both direct and indirect
Inconsistent mediation	small/not sig	sig	sig (opposite sign)	Suppression; $|c'| > |c|$
Competitive mediation	sig	sig	sig (opposite sign)	Direct and indirect cancel

7.1.5 Worked Calculation — Simple Mediation

Suppose $n = 250$ , and we test whether self-efficacy ( $M$ ) mediates the effect of exercise ( $X$ , hours/week) on wellbeing ( $Y$ , 0–100 scale).

Path a equation (predicting self-efficacy from exercise):

$\hat{M} = 4.20 + 0.58 X$ → $\hat{a} = 0.58$ , $SE = 0.11$ , $p < .001$

Path b and c' equation (predicting wellbeing from exercise and self-efficacy):

$\hat{Y} = 21.30 + 0.31 X + 0.44 M$ → $\hat{b} = 0.44$ , $\hat{c'} = 0.31$

Total effect (predicting wellbeing from exercise only):

$\hat{Y} = 23.15 + 0.57 X$ → $\hat{c} = 0.57$

Indirect effect:

$\hat{a} \times \hat{b} = 0.58 \times 0.44 = 0.255$

Verification: $c' + ab = 0.31 + 0.255 = 0.565 \approx c = 0.57$ ✅

Bootstrap 95% BCa CI for indirect effect: $[0.132, 0.403]$

Conclusion: The indirect effect of exercise on wellbeing through self-efficacy is $ab = 0.255$ (95% BCa CI [0.132, 0.403]). Since the CI excludes zero, self-efficacy significantly mediates the exercise-wellbeing relationship. This is partial mediation — the direct effect ( $c' = 0.31$ ) remains significant. Each additional hour of exercise per week is associated with a 0.255-point increase in wellbeing through increased self-efficacy.
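The arithmetic in this worked example can be re-checked in a few lines. All numbers are taken from the equations above; only the variable names are assumptions of this sketch.

```python
a_hat, se_a = 0.58, 0.11      # a-path estimate and its standard error
b_hat = 0.44                  # b-path estimate
c_prime = 0.31                # direct effect
c_total = 0.57                # total effect
bca_ci = (0.132, 0.403)       # reported 95% BCa CI for ab

indirect = a_hat * b_hat                      # 0.2552, reported as 0.255
reconstructed_total = c_prime + indirect      # should be close to c_total
t_a = a_hat / se_a                            # t-statistic for the a-path
excludes_zero = bca_ci[0] > 0 or bca_ci[1] < 0

print(round(indirect, 4), round(reconstructed_total, 4), round(t_a, 2), excludes_zero)
```

The small gap between $c' + ab$ and $c$ comes from rounding the reported coefficients to two decimals.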

7.2 Parallel Multiple Mediation

Parallel multiple mediation tests whether $X$ affects $Y$ through two or more simultaneous mediators that are not causally connected to each other.

7.2.1 Model Specification (Two Mediators)

Path equations:

$M_1 = i_{M_1} + a_1 X + e_{M_1}$

$M_2 = i_{M_2} + a_2 X + e_{M_2}$

$Y = i_Y + c'X + b_1 M_1 + b_2 M_2 + e_Y$

Path diagram:

             a₁            b₁
        ┌────────→ M₁ ────────┐
        │                     ↓
        X ──────── c' ──────→ Y
        │                     ↑
        └────────→ M₂ ────────┘
             a₂            b₂

7.2.2 Effects Decomposition

Effect	Formula	Description
Specific indirect via $M_1$	$a_1 b_1$	Effect through mediator 1
Specific indirect via $M_2$	$a_2 b_2$	Effect through mediator 2
Total indirect	$a_1 b_1 + a_2 b_2$	Combined indirect effect
Direct effect	$c'$	Effect not through any mediator
Total effect	$c' + a_1 b_1 + a_2 b_2$	All pathways combined
Contrast ( $M_1$ vs $M_2$ )	$a_1 b_1 - a_2 b_2$	Difference between indirect effects

7.2.3 Testing Contrasts Between Indirect Effects

A key advantage of parallel mediation over separate simple mediations is the ability to directly compare specific indirect effects. The contrast tests whether mediator $M_1$ transmits significantly MORE or LESS of the effect than $M_2$ :

$C_{12} = a_1 b_1 - a_2 b_2$

A bootstrapped 95% CI for $C_{12}$ that excludes zero indicates that the two indirect effects differ significantly. This allows statements such as: "The indirect effect through self-efficacy was significantly larger than the indirect effect through motivation ( $C_{12} = 0.18$ , 95% BCa CI [0.04, 0.35])."
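
A minimal sketch of the contrast test, assuming simulated data and a percentile bootstrap (swapped in for BCa to keep the example short). The function names `coefs`, `specific_indirect` and `contrast_ci` are hypothetical, as are the data-generating values.

```python
import numpy as np

def coefs(predictors, outcome):
    """OLS coefficients, intercept first."""
    design = np.column_stack([np.ones(len(outcome))] + list(predictors))
    return np.linalg.lstsq(design, outcome, rcond=None)[0]

def specific_indirect(x, m1, m2, y):
    """Return (a1*b1, a2*b2) from the parallel two-mediator model."""
    a1 = coefs([x], m1)[1]
    a2 = coefs([x], m2)[1]
    _, _, b1, b2 = coefs([x, m1, m2], y)   # both mediators in one Y equation
    return a1 * b1, a2 * b2

def contrast_ci(x, m1, m2, y, n_boot=2000, seed=0):
    """95% percentile bootstrap CI for C12 = a1*b1 - a2*b2."""
    rng = np.random.default_rng(seed)
    n = len(x)
    diffs = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, n)
        ie1, ie2 = specific_indirect(x[idx], m1[idx], m2[idx], y[idx])
        diffs[i] = ie1 - ie2
    return np.percentile(diffs, [2.5, 97.5])

# Simulated data in which M1 carries most of the indirect effect (assumed values)
rng = np.random.default_rng(7)
n = 300
x = rng.normal(size=n)
m1 = 0.6 * x + rng.normal(size=n)
m2 = 0.1 * x + rng.normal(size=n)
y = 0.2 * x + 0.5 * m1 + 0.1 * m2 + rng.normal(size=n)

ie1, ie2 = specific_indirect(x, m1, m2, y)
lo, hi = contrast_ci(x, m1, m2, y)
print(f"C12 = {ie1 - ie2:.3f}, 95% percentile CI [{lo:.3f}, {hi:.3f}]")
```

Both specific indirect effects are recomputed inside every resample, so the interval reflects their sampling covariance rather than treating them as independent.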

7.2.4 Why Parallel Mediation is Preferred Over Separate Analyses

Reason	Explanation
Controls for mediators simultaneously	Each $b$ coefficient controls for the other mediators
Avoids inflated Type I error	One model rather than multiple separate tests
Enables direct comparison	Contrasts between specific indirect effects are possible
More powerful	Estimates are more precise when mediators share variance

⚠️ Never run separate simple mediation analyses for each mediator and compare results informally. Always include all mediators simultaneously in a single model to correctly estimate specific indirect effects.

7.3 Sequential (Serial) Mediation

Sequential mediation (also called serial mediation) tests a causal chain hypothesis where $X$ affects $M_1$ , which then affects $M_2$ , which then affects $Y$ . This model makes stronger theoretical claims than parallel mediation.

7.3.1 Model Specification (Two Sequential Mediators)

Equation 1:

$M_1 = i_{M_1} + a_1 X + e_{M_1}$

Equation 2:

$M_2 = i_{M_2} + a_2 X + d_{21} M_1 + e_{M_2}$

Equation 3:

$Y = i_Y + c'X + b_1 M_1 + b_2 M_2 + e_Y$

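Path diagram (each path listed with its coefficient):

    X ── a₁ ──→ M₁ ── d₂₁ ──→ M₂ ── b₂ ──→ Y    (full serial chain)
    X ── a₂ ──→ M₂                              (X to M₂, bypassing M₁)
    M₁ ── b₁ ──→ Y                              (M₁ to Y, bypassing M₂)
    X ── c' ──→ Y                               (direct effect)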

7.3.2 The Three Indirect Effects

Path	Formula	Description
$X \rightarrow M_1 \rightarrow Y$	$a_1 b_1$	Through $M_1$ only, bypassing $M_2$
$X \rightarrow M_2 \rightarrow Y$	$a_2 b_2$	Through $M_2$ only, bypassing $M_1$
$X \rightarrow M_1 \rightarrow M_2 \rightarrow Y$	$a_1 d_{21} b_2$	The full serial pathway
Total indirect	$a_1 b_1 + a_2 b_2 + a_1 d_{21} b_2$	All indirect pathways
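
The three indirect effects can be computed directly from the three regression equations. The sketch below uses simulated data with assumed coefficients and a hypothetical `coefs` helper; it also confirms that, under OLS on the same cases, the total effect equals $c'$ plus the total indirect effect.

```python
import numpy as np

def coefs(predictors, outcome):
    """OLS coefficients, intercept first."""
    design = np.column_stack([np.ones(len(outcome))] + list(predictors))
    return np.linalg.lstsq(design, outcome, rcond=None)[0]

rng = np.random.default_rng(3)
n = 400
x = rng.normal(size=n)
m1 = 0.5 * x + rng.normal(size=n)
m2 = 0.2 * x + 0.6 * m1 + rng.normal(size=n)
y = 0.1 * x + 0.3 * m1 + 0.4 * m2 + rng.normal(size=n)

a1 = coefs([x], m1)[1]                       # Equation 1
_, a2, d21 = coefs([x, m1], m2)              # Equation 2
_, c_prime, b1, b2 = coefs([x, m1, m2], y)   # Equation 3
c = coefs([x], y)[1]                         # total effect

effects = {
    "X -> M1 -> Y": a1 * b1,
    "X -> M2 -> Y": a2 * b2,
    "X -> M1 -> M2 -> Y": a1 * d21 * b2,
}
total_indirect = sum(effects.values())

for path, value in effects.items():
    print(f"{path:<20} {value:.3f}")
print(f"total indirect = {total_indirect:.3f}, c - c' = {c - c_prime:.3f}")
```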

7.3.3 When to Use Sequential vs. Parallel Mediation

Feature	Sequential	Parallel
Mediators causally connected ( $M_1 \rightarrow M_2$ )?	Yes	No
Theoretical justification for chain?	Required	Not required
Tests the full causal process?	Yes	Partially
More complex model?	Yes	Simpler
Requires larger sample?	Yes	Less so

💡 Use sequential mediation ONLY when there is strong theoretical (and ideally temporal) justification for a causal chain between mediators. If the causal connection between $M_1$ and $M_2$ is ambiguous, use parallel mediation and add the $M_1 \rightarrow M_2$ path as a sensitivity check.

7.3.4 Extending to Three or More Sequential Mediators

With three sequential mediators $M_1 \rightarrow M_2 \rightarrow M_3$ , the model contains four equations (one for each mediator and one for $Y$ ) and seven indirect effects (one for each non-empty subset of mediators, taken in causal order), including the full chain $a_1 d_{21} d_{32} b_3$ .

For $k$ sequential mediators, the number of indirect effects is $2^k - 1$ .
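
The count $2^k - 1$ arises because every non-empty subset of the mediators, kept in causal order, defines one indirect path. A short sketch that enumerates them (`indirect_paths` is a hypothetical helper written for this illustration):

```python
from itertools import combinations

def indirect_paths(k):
    """List every indirect path from X to Y through k ordered mediators."""
    mediators = [f"M{i}" for i in range(1, k + 1)]
    paths = []
    for size in range(1, k + 1):
        # combinations() keeps the mediators in their causal order
        for subset in combinations(mediators, size):
            paths.append(" -> ".join(("X",) + subset + ("Y",)))
    return paths

paths = indirect_paths(3)
for p in paths:
    print(p)
print(len(paths), "indirect effects")
```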

⚠️ Models with more than two sequential mediators require very large samples ( $n \geq 500$ ) and should have very strong theoretical justification. The risk of overfitting and non-replication increases substantially with model complexity.

8. Moderation Analysis

8.1 Simple Moderation

Simple moderation tests whether the relationship between $X$ and $Y$ depends on the level of a third variable $W$ (the moderator).

8.1.1 Model Specification

$Y = b_0 + b_1 X + b_2 W + b_3 (X \times W) + e$

The hypothesis of moderation is tested by $H_0: b_3 = 0$ .

Rejecting $H_0$ (a statistically significant $b_3$ ) indicates that the effect of $X$ on $Y$ varies as a function of $W$ .

8.1.2 Interpreting the Interaction Coefficient

The interaction coefficient $b_3$ tells us:

"For each one-unit increase in $W$ , the effect of $X$ on $Y$ changes by $b_3$ units."

Or equivalently:

"The conditional effect of $X$ on $Y$ is $b_1 + b_3 W$ ."

The sign and magnitude of $b_3$ determine the pattern of moderation:

$b_3$	Pattern	Visualisation
Positive, large	Strong positive moderation	Steeper slope at high $W$
Positive, small	Weak positive moderation	Slightly steeper slope at high $W$
Zero	No moderation	Parallel lines across $W$ values
Negative, small	Weak negative moderation	Slightly flatter slope at high $W$
Negative, large	Strong negative moderation	Reversed/attenuated slope at high $W$

8.1.3 Simple Slopes Analysis (Spotlight Analysis)

After finding a significant interaction, compute simple slopes at three values of $W$ :

Low $W$ : $\hat{\theta}_{XY}(W_L) = b_1 + b_3(M_W - SD_W)$

Mean $W$ : $\hat{\theta}_{XY}(M_W) = b_1 + b_3(M_W)$

High $W$ : $\hat{\theta}_{XY}(W_H) = b_1 + b_3(M_W + SD_W)$

For each simple slope, compute:

The unstandardised coefficient (the simple slope value).
The standard error.
The t-statistic and p-value.
The 95% confidence interval.
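
The standard error of a simple slope comes from the coefficient covariance matrix: $SE = \sqrt{\text{Var}(b_1) + 2w\,\text{Cov}(b_1, b_3) + w^2\,\text{Var}(b_3)}$ . A minimal sketch on simulated data follows; the function names and the data-generating coefficients are assumptions made for illustration.

```python
import numpy as np

def fit_moderation(x, w, y):
    """OLS fit of Y on X, W and X*W; returns coefficients and their covariance."""
    design = np.column_stack([np.ones(len(y)), x, w, x * w])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    sigma2 = resid @ resid / (len(y) - design.shape[1])   # residual variance
    cov = sigma2 * np.linalg.inv(design.T @ design)
    return beta, cov

def simple_slope(beta, cov, w_value):
    """Conditional effect of X at W = w_value, with its standard error."""
    slope = beta[1] + beta[3] * w_value
    var = cov[1, 1] + 2 * w_value * cov[1, 3] + w_value**2 * cov[3, 3]
    return slope, np.sqrt(var)

rng = np.random.default_rng(11)
n = 280
x = rng.normal(size=n)
w = rng.normal(size=n)
y = 0.58 * x - 0.31 * w - 0.19 * x * w + rng.normal(size=n)

beta, cov = fit_moderation(x, w, y)
sd_w = w.std(ddof=1)
probe_values = [("Low", w.mean() - sd_w), ("Mean", w.mean()), ("High", w.mean() + sd_w)]
for label, w_value in probe_values:
    slope, se = simple_slope(beta, cov, w_value)
    print(f"{label:>4}: slope = {slope:.3f}, SE = {se:.3f}, t = {slope / se:.2f}")
```

The $t$ statistic is referred to a $t$ distribution with $n - 4$ degrees of freedom (four estimated coefficients) to obtain the p-value and confidence interval.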

8.1.4 Visualising Moderation

The interaction should always be visualised with an interaction plot (also called a moderation plot or simple slopes plot):

X-axis: The predictor $X$ (typically plotted at two values: $\bar{X} \pm 1\text{SD}$ ).
Y-axis: The predicted outcome $\hat{Y}$ .
Lines: One line for each level of $W$ (Low, Mean, High).

Non-parallel lines indicate moderation. Crossing lines indicate that the direction of the $X \rightarrow Y$ relationship reverses across values of $W$ .

8.1.5 Moderator Variable Types

Moderator Type	Example	Analytical Note
Continuous	Age, income, personality score	Mean-centre before creating interaction
Binary (0/1)	Gender, treatment condition	No centering needed; compare two slopes
Multicategorical	Ethnicity (3+ groups)	Use dummy coding; one interaction per dummy

For binary moderators: The interaction test compares the simple slope of $X$ on $Y$ in Group 0 vs. Group 1. The simple slope in Group 0 is $b_1$ (since $W = 0$ ); in Group 1 it is $b_1 + b_3$ (since $W = 1$ ).

8.2 Multiple Moderation

Multiple moderation tests whether two (or more) moderators $W_1$ and $W_2$ independently moderate the $X \rightarrow Y$ relationship in the same model.

8.2.1 Model Specification

$Y = b_0 + b_1 X + b_2 W_1 + b_3 W_2 + b_4 (X \times W_1) + b_5 (X \times W_2) + e$

Where:

$b_4$ = interaction between $X$ and $W_1$ (adjusting for $W_2$ and $X \times W_2$ ).
$b_5$ = interaction between $X$ and $W_2$ (adjusting for $W_1$ and $X \times W_1$ ).

The conditional effect of $X$ on $Y$ given specific values of $W_1 = w_1$ and $W_2 = w_2$ :

$\hat{\theta}(w_1, w_2) = b_1 + b_4 w_1 + b_5 w_2$

8.2.2 Interpreting Multiple Moderation Results

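Because the model contains no $W_1 \times W_2$ or three-way term, each interaction coefficient describes a change in the effect of $X$ that is the same at every value of the other moderator:

$b_4$ : How the effect of $X$ on $Y$ changes per unit of $W_1$ , holding $W_2$ constant.
$b_5$ : How the effect of $X$ on $Y$ changes per unit of $W_2$ , holding $W_1$ constant.

It is $b_1$ that is conditional: it is the effect of $X$ on $Y$ when $W_1 = 0$ and $W_2 = 0$ (or at $\bar{W}_1$ and $\bar{W}_2$ , if the moderators are mean-centred).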

Simple slopes are probed at all combinations of $W_1$ and $W_2$ levels (e.g., $3 \times 3 = 9$ combinations for Low/Mean/High values of both moderators).

💡 Multiple moderation models require substantially larger samples than simple moderation ( $n \geq 300$ recommended) because two interaction terms consume more degrees of freedom and interactions are typically harder to detect statistically.

8.3 Moderated Moderation (Three-Way Interaction)

Moderated moderation (also called a three-way interaction) tests whether the moderation effect of $W$ on the $X \rightarrow Y$ relationship is itself moderated by a third variable $V$ . In other words: does the strength of the interaction between $X$ and $W$ depend on $V$ ?

8.3.1 Model Specification

$Y = b_0 + b_1 X + b_2 W + b_3 V + b_4 (X \times W) + b_5 (X \times V) + b_6 (W \times V) + b_7 (X \times W \times V) + e$

Where:

$b_7$ is the three-way interaction coefficient — the test of moderated moderation.

The conditional interaction effect (how the $X \times W$ interaction varies with $V$ ):

$\hat{\theta}_{XW}(v) = b_4 + b_7 v$

The conditional simple slope of $X$ on $Y$ at specific values of $W = w$ and $V = v$ :

$\hat{\theta}_{X}(w, v) = b_1 + b_4 w + b_5 v + b_7 wv$

8.3.2 Probing Three-Way Interactions

To probe a significant three-way interaction ( $b_7 \neq 0$ ):

Step 1: Pick values of $V$ (e.g., $V_{Low}$ , $V_{Mean}$ , $V_{High}$ ).

Step 2: At each value of $V$ , compute the two-way interaction between $X$ and $W$ :

$\hat{\theta}_{XW}(v) = b_4 + b_7 v$

Step 3: At each combination of $W$ and $V$ values, compute the simple slope of $X$ :

$\hat{\theta}_X(w, v) = b_1 + b_4 w + b_5 v + b_7 wv$

This produces $3 \times 3 = 9$ simple slopes (at Low/Mean/High of both $W$ and $V$ ), organised into three two-way interaction plots, one for each level of $V$ .
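
The probing steps reduce to two small formulas. The sketch below evaluates them on a $3 \times 3$ grid; the coefficient values and the Low/Mean/High levels (coded as -1, 0, +1 for centred, unit-SD moderators) are hypothetical.

```python
def conditional_xw_interaction(b, v):
    """Two-way X*W interaction at V = v: b4 + b7*v."""
    return b[4] + b[7] * v

def conditional_slope(b, w, v):
    """Simple slope of X at W = w and V = v: b1 + b4*w + b5*v + b7*w*v."""
    return b[1] + b[4] * w + b[5] * v + b[7] * w * v

# Hypothetical coefficients b0..b7 from the three-way model
b = [2.0, 0.50, 0.20, 0.10, -0.15, 0.05, 0.02, 0.08]
levels = {"Low": -1.0, "Mean": 0.0, "High": 1.0}

grid = {
    (w_label, v_label): conditional_slope(b, w, v)
    for v_label, v in levels.items()
    for w_label, w in levels.items()
}

for v_label, v in levels.items():
    print(f"V = {v_label}: X*W interaction = {conditional_xw_interaction(b, v):+.2f}")
    for w_label in levels:
        print(f"    W = {w_label:<4} slope of X = {grid[(w_label, v_label)]:+.2f}")
```

Within each panel (fixed $V$ ), the slope of $X$ changes by exactly $b_4 + b_7 v$ per unit of $W$ , which is why Step 2 and Step 3 always agree.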

8.3.3 Visualising Three-Way Interactions

A three-way interaction requires three two-way interaction plots arranged side by side:

Left panel: $X \rightarrow Y$ at $V_{Low}$ , with lines for $W_{Low}$ , $W_{Mean}$ , $W_{High}$ .
Centre panel: $X \rightarrow Y$ at $V_{Mean}$ , with lines for $W_{Low}$ , $W_{Mean}$ , $W_{High}$ .
Right panel: $X \rightarrow Y$ at $V_{High}$ , with lines for $W_{Low}$ , $W_{Mean}$ , $W_{High}$ .

A three-way interaction shows up visually as a change in the pattern of the two-way $X \times W$ interaction across the three panels (e.g., the lines diverge more or less, or the direction of moderation reverses). Statistical significance is established by the test of $b_7$ , not by the plots.

⚠️ Three-way interactions require very large samples ( $n \geq 400$ minimum, ideally $n \geq 600$ ) to achieve adequate statistical power. They are also notoriously difficult to replicate. Always interpret with caution and seek replication.

9. Conditional Process Analysis

Conditional process analysis (Hayes, 2013) refers to models that simultaneously incorporate mediation and moderation — the goal is to understand both how (mediation) and when/for whom (moderation) $X$ affects $Y$ .

9.1 Moderated Mediation — First-Stage (W Moderates the a-Path)

In first-stage moderated mediation, the moderator $W$ changes the strength of the $X \rightarrow M$ relationship (the a-path). Different levels of $W$ produce different magnitudes of the effect of $X$ on $M$ , which in turn produces different indirect effects.

9.1.1 Model Equations

Equation 1 (a-path moderated by W):

$M = i_M + a_1 X + a_2 W + a_3 (X \times W) + e_M$

Equation 2 (b-path unmoderated):

$Y = i_Y + c'X + bM + e_Y$

9.1.2 The Conditional Indirect Effect

The indirect effect varies with $W$ because the a-path is moderated:

$\text{Indirect Effect}(w) = (a_1 + a_3 w) \times b$

At specific values of $W$ :

$\text{IE}(W_{Low}) = (a_1 + a_3 w_L) \times b$

$\text{IE}(W_{Mean}) = (a_1 + a_3 w_M) \times b$

$\text{IE}(W_{High}) = (a_1 + a_3 w_H) \times b$

9.1.3 Index of Moderated Mediation

The index of moderated mediation (IMM) is the rate at which the indirect effect changes as $W$ increases by one unit:

$\text{IMM} = a_3 \times b$

If the bootstrapped 95% CI for IMM excludes zero → the indirect effect is significantly moderated by $W$ .
The sign of IMM indicates the direction: positive IMM means the indirect effect increases with $W$ ; negative means it decreases.
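
As a worked illustration with hypothetical coefficients (not taken from any example in this tutorial), the conditional indirect effect is a straight line in $W$ whose slope is the IMM:

```python
def conditional_indirect(a1, a3, b, w):
    """First-stage conditional indirect effect: (a1 + a3*w) * b."""
    return (a1 + a3 * w) * b

# Hypothetical estimates; W is assumed centred with SD = 1
a1, a3, b = 0.40, -0.10, 0.50
imm = a3 * b   # index of moderated mediation

for label, w in [("Low", -1.0), ("Mean", 0.0), ("High", 1.0)]:
    print(f"IE at {label:<4} W = {conditional_indirect(a1, a3, b, w):.3f}")
print(f"IMM = {imm:.3f}")
```

Here the indirect effect falls by 0.05 for every one-unit increase in $W$ . In practice each of these quantities would be accompanied by a bootstrapped confidence interval.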

9.1.4 Path Diagram

            W
            ↓ (moderates a-path)
    X ── a₁ + a₃W ──→ M ── b ──→ Y
    X ── c' ───────────────────→ Y

9.2 Moderated Mediation — Second-Stage (W Moderates the b-Path)

In second-stage moderated mediation, the moderator $W$ changes the strength of the $M \rightarrow Y$ relationship (the b-path). The mechanism by which $M$ transmits the effect to $Y$ depends on the level of $W$ .

9.2.1 Model Equations

Equation 1 (a-path unmoderated):

$M = i_M + aX + e_M$

Equation 2 (b-path moderated by W):

$Y = i_Y + c'X + b_1 M + b_2 W + b_3 (M \times W) + e_Y$

9.2.2 The Conditional Indirect Effect

The indirect effect varies with $W$ because the b-path is moderated:

$\text{Indirect Effect}(w) = a \times (b_1 + b_3 w)$

9.2.3 Index of Moderated Mediation

$\text{IMM} = a \times b_3$

A bootstrapped 95% CI for $a \times b_3$ that excludes zero indicates significant moderation of the indirect effect.

9.2.4 Path Diagram

                          W
                          ↓ (moderates b-path)
    X ── a ──→ M ── b₁ + b₃W ──→ Y
    X ── c' ───────────────────→ Y

9.3 Moderated Mediation — Both Stages (W Moderates Both a- and b-Paths)

The most complex single-moderator model has $W$ moderating both the a-path ( $X \rightarrow M$ ) and the b-path ( $M \rightarrow Y$ ) simultaneously.

9.3.1 Model Equations

Equation 1 (a-path moderated by W):

$M = i_M + a_1 X + a_2 W + a_3 (X \times W) + e_M$

Equation 2 (b-path also moderated by W):

$Y = i_Y + c'X + b_1 M + b_2 W + b_3 (M \times W) + e_Y$

9.3.2 The Conditional Indirect Effect

Both the a-path and the b-path are now functions of $W$ :

$\text{Indirect Effect}(w) = (a_1 + a_3 w) \times (b_1 + b_3 w)$

This is a quadratic function of $W$ — the indirect effect can change non-linearly across values of $W$ .

Expanding:

$\text{IE}(w) = a_1 b_1 + (a_1 b_3 + a_3 b_1)w + a_3 b_3 w^2$
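
The expansion can be checked numerically. The sketch below multiplies the two linear terms as polynomials in $w$ and compares the result with the product form; the coefficient values are hypothetical.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Hypothetical path estimates
a1, a3 = 0.40, -0.10
b1, b3 = 0.50, 0.20

# Coefficients in ascending powers of w: [a1*b1, a1*b3 + a3*b1, a3*b3]
coeffs = P.polymul([a1, a3], [b1, b3])
print("constant, linear, quadratic:", coeffs)

for w in (-1.0, 0.0, 1.0):
    product_form = (a1 + a3 * w) * (b1 + b3 * w)
    expanded_form = P.polyval(w, coeffs)
    print(f"w = {w:+.0f}: product = {product_form:.3f}, expanded = {expanded_form:.3f}")
```

Because the quadratic coefficient $a_3 b_3$ is non-zero here, the change in the indirect effect per unit of $W$ is itself a function of $W$ .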

9.3.3 Index of Moderated Mediation

Because the indirect effect is quadratic in $W$ , there is no single IMM. Instead, the conditional indirect effect must be evaluated at multiple values of $W$ and tested with bootstrapped CIs at each value.

A useful summary is the joint significance of $a_3$ and $b_3$ — if both are significant, moderation at both stages is supported. However, always report the full conditional indirect effect table with bootstrapped CIs.

9.3.4 Path Diagram

                 W
                 ↓ (moderates both a- and b-paths)
    X ── a₁ + a₃W ──→ M ── b₁ + b₃W ──→ Y
    X ── c' ──────────────────────────→ Y

9.4 Mediated Moderation

Mediated moderation is conceptually the inverse of moderated mediation. Instead of asking "does the indirect effect depend on $W$ ?", it asks: "does the effect of the interaction $X \times W$ on $Y$ operate through a mediator $M$ ?"

9.4.1 Model Equations

Equation 1 (mediating the interaction):

$M = i_M + a_1 X + a_2 W + a_3 (X \times W) + e_M$

Equation 2:

$Y = i_Y + c'_1 X + c'_2 W + c'_3 (X \times W) + dM + e_Y$

9.4.2 The Indirect Effect of $X \times W$ Through $M$

The indirect effect of the interaction on $Y$ through $M$ is:

$\text{Indirect Effect of } X \times W = a_3 \times d$

This answers: "How much of the interaction effect of $X$ and $W$ on $Y$ is transmitted through the mediator $M$ ?"

9.4.3 Mediated Moderation vs. Moderated Mediation: An Important Distinction

Feature	Moderated Mediation	Mediated Moderation
Primary question	Does the indirect effect ( $ab$ ) vary with $W$ ?	Does $M$ explain why $X \times W$ affects $Y$ ?
Focus	Conditional indirect effect of $X$ on $Y$	Mechanism of an interaction effect
Centrepiece	The indirect effect	The interaction ( $X \times W$ )
Modern preference	More commonly used	Less commonly used
When to use	Theory specifies how $M$ connects $X$ and $Y$	Theory specifies why an interaction exists

💡 Most modern researchers prefer to frame their models as moderated mediation rather than mediated moderation, because moderated mediation makes clearer and more direct theoretical statements about conditional processes. If your theory is about an interaction effect and why it exists, mediated moderation is appropriate.

9.4.4 Path Diagram

    X × W ──── a₃ ────→ M ──── d ────→ Y
    (interaction term)

9.5 Sequential Moderated Mediation

Sequential moderated mediation combines sequential (serial) mediation with moderation — a moderator $W$ affects one or more paths within a two-mediator sequential model.

9.5.1 Model Equations (W Moderates the First a-Path)

Equation 1 (a₁-path moderated by W):

$M_1 = i_{M_1} + a_{11} X + a_{12} W + a_{13} (X \times W) + e_{M_1}$

Equation 2 (d₂₁-path unmoderated):

$M_2 = i_{M_2} + a_2 X + d_{21} M_1 + e_{M_2}$

Equation 3:

$Y = i_Y + c'X + b_1 M_1 + b_2 M_2 + e_Y$

9.5.2 The Conditional Indirect Effects

With $W$ moderating the first a-path, the three indirect effects become:

Path	Conditional IE	Varies with $W$ ?
$X \rightarrow M_1 \rightarrow Y$	$(a_{11} + a_{13}w) \times b_1$	Yes
$X \rightarrow M_2 \rightarrow Y$	$a_2 \times b_2$	No
$X \rightarrow M_1 \rightarrow M_2 \rightarrow Y$	$(a_{11} + a_{13}w) \times d_{21} \times b_2$	Yes

9.5.3 Index of Moderated Mediation

For the path through $M_1$ only: $\text{IMM}_1 = a_{13} \times b_1$

For the sequential path: $\text{IMM}_{serial} = a_{13} \times d_{21} \times b_2$

Both indices can be tested with bootstrapped confidence intervals.

9.5.4 Variants of Sequential Moderated Mediation

Variant	Moderated Path	Hayes Model
W moderates $X \rightarrow M_1$ only	First a-path	Model 83
W moderates $M_1 \rightarrow M_2$ only	d-path	Model 84
W moderates $M_2 \rightarrow Y$ only	Second b-path	Model 85
W moderates $X \rightarrow M_1$ and $M_2 \rightarrow Y$	First a and second b	Model 91

9.5.5 Path Diagram (W Moderates First A-Path)

            W
            ↓ (moderates a₁-path)
    X ── a₁₁ + a₁₃W ──→ M₁ ── d₂₁ ──→ M₂ ── b₂ ──→ Y
    X ── a₂ ──→ M₂
    M₁ ── b₁ ──→ Y
    X ── c' ──→ Y

10. Model Fit and Evaluation

10.1 $R^2$ for Each Equation

Because mediation and moderation analyses consist of multiple regression equations, model fit is evaluated separately for each equation in the model.

$R^2$ for equation predicting $M$ :

$R^2_M = 1 - \frac{SS_{res,M}}{SS_{tot,M}}$

$R^2$ for equation predicting $Y$ :

$R^2_Y = 1 - \frac{SS_{res,Y}}{SS_{tot,Y}}$

Adjusted $R^2$ corrects for the number of predictors:

$R^2_{adj} = 1 - (1 - R^2)\frac{n-1}{n-k-1}$

Where $k$ is the number of predictors in the equation.

10.2 $\Delta R^2$ for the Interaction Term

In moderation analysis, the incremental $R^2$ ( $\Delta R^2$ ) quantifies how much variance the interaction term $X \times W$ adds beyond the main effects of $X$ and $W$ :

$\Delta R^2 = R^2_{\text{with interaction}} - R^2_{\text{without interaction}}$

The statistical significance of this increment is tested with an F-test:

$F = \frac{\Delta R^2 / \Delta k}{(1 - R^2_{\text{full}}) / (n - k_{full} - 1)}$

With $\Delta k = 1$ for a single interaction term.
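
A small helper makes the arithmetic explicit. The function name `delta_r2_f` and the input values are hypothetical; the p-value would be read from an $F(\Delta k,\; n - k_{full} - 1)$ distribution.

```python
def delta_r2_f(r2_full, r2_reduced, n, k_full, delta_k=1):
    """Return (delta R^2, F statistic, (df1, df2)) for the added term(s)."""
    delta_r2 = r2_full - r2_reduced
    df2 = n - k_full - 1
    f_stat = (delta_r2 / delta_k) / ((1 - r2_full) / df2)
    return delta_r2, f_stat, (delta_k, df2)

# Hypothetical moderation model: X, W and X*W (k_full = 3) with n = 200
delta_r2, f_stat, df = delta_r2_f(r2_full=0.30, r2_reduced=0.27, n=200, k_full=3)
print(f"Delta R^2 = {delta_r2:.3f}, F{df} = {f_stat:.2f}")
```

With a single added term, this $F$ equals the square of the $t$ statistic for $b_3$ in the full model, so the two tests always agree.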

Interpreting $\Delta R^2$ for interactions:

$\Delta R^2$	Interpretation
$< 0.01$	Negligible interaction
$0.01 - 0.05$	Small but potentially meaningful
$0.05 - 0.10$	Moderate interaction
$> 0.10$	Large interaction

⚠️ Interaction effects in observational data are typically small ( $\Delta R^2 = 0.01$ to $0.03$ ). Do not discard a theoretically supported interaction merely because $\Delta R^2$ is small — use power analysis to determine whether your study is adequately powered to detect the interaction.

10.3 Evaluating the Indirect Effect

Point Estimate:

The indirect effect $\hat{a} \times \hat{b}$ is a point estimate. A non-zero point estimate alone is insufficient evidence for mediation.

Bootstrapped Confidence Intervals:

CI Type	Description	Recommendation
Percentile CI	Simple quantiles of bootstrap distribution	Adequate for most purposes
BCa CI	Bias-corrected and accelerated	Preferred — corrects for bias and skewness
Normal-theory CI	Based on asymptotic normality assumption	Not recommended — assumes normality
Monte Carlo CI	Based on parametric sampling from coefficient distributions	Good alternative to bootstrap

Decision rule: If the 95% BCa CI for $ab$ excludes zero → significant indirect effect (mediation is supported). If it includes zero → insufficient evidence for mediation.

10.4 Effect Sizes for Mediation

Several effect size measures are available for the indirect effect:

$\kappa^2$ (Preacher & Kelley, 2011): The ratio of the indirect effect to the maximum possible indirect effect given the data:

$\kappa^2 = \frac{a \times b}{\text{max}(ab)}$

Ranges from 0 to 1. Benchmarks: small = 0.01, medium = 0.09, large = 0.25.

Completely Standardised Indirect Effect ( $ab_{cs}$ ): The indirect effect when all variables are standardised:

$ab_{cs} = ab \times \frac{\sigma_X}{\sigma_Y}$
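
The rescaling formula gives the same number as refitting the model on z-scored variables. A minimal check on simulated data (helper names and data-generating values are assumptions):

```python
import numpy as np

def coefs(predictors, outcome):
    """OLS coefficients, intercept first."""
    design = np.column_stack([np.ones(len(outcome))] + list(predictors))
    return np.linalg.lstsq(design, outcome, rcond=None)[0]

def indirect(x, m, y):
    """Unstandardised indirect effect a*b."""
    return coefs([x], m)[1] * coefs([x, m], y)[2]

def zscore(v):
    return (v - v.mean()) / v.std(ddof=1)

rng = np.random.default_rng(5)
n = 300
x = rng.normal(10, 4, n)
m = 2.0 + 0.6 * x + rng.normal(0, 3, n)
y = 5.0 + 0.2 * x + 0.5 * m + rng.normal(0, 4, n)

ab = indirect(x, m, y)
ab_cs = ab * x.std(ddof=1) / y.std(ddof=1)            # formula above
ab_from_z = indirect(zscore(x), zscore(m), zscore(y)) # refit on z-scores

print(f"ab = {ab:.3f}, ab_cs = {ab_cs:.3f}, from z-scores = {ab_from_z:.3f}")
```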

Proportion Mediated ( $PM$ ):

$PM = \frac{ab}{c}$ (use with caution; see Section 3.3)

10.5 Effect Sizes for Moderation

$f^2$ (Cohen's effect size for regression):

$f^2 = \frac{R^2_{full} - R^2_{reduced}}{1 - R^2_{full}} = \frac{\Delta R^2}{1 - R^2_{full}}$

$f^2$	Effect Size
$0.02$	Small
$0.15$	Medium
$0.35$	Large

💡 For interactions in observational data, $f^2 = 0.005$ to $0.02$ is common. Use this as input for power analysis with G*Power or DataStatPro's power module to determine the minimum required sample size.

10.6 Confidence Intervals for Simple Slopes

Each simple slope (conditional effect of $X$ at a specific value of $W$ ) has:

A standard error (derived from the variance-covariance matrix of the coefficients).
A t-statistic (slope / SE).
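A 95% confidence interval: slope $\pm\, t_{crit} \times$ SE, where $t_{crit}$ is the critical $t$ value for the residual degrees of freedom (approximately 1.96 in large samples).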

A simple slope whose 95% CI excludes zero is statistically significant — the effect of $X$ on $Y$ is significantly different from zero at that value of $W$ .

10.7 Omnibus Fit Summary

For the full model, report:

Statistic	What to Report	When
$R^2_M$	Variance in $M$ explained by model	All mediation models
$R^2_Y$	Variance in $Y$ explained by model	All models
$F(df_1, df_2)$ , $p$	Overall model significance	All models
$b_3$ (interaction)	Interaction coefficient, SE, $t$ , $p$	All moderation models
$\Delta R^2$	Increment from interaction	All moderation models
$ab$ (indirect)	Indirect effect with BCa CI	All mediation models
$\text{IMM}$	Index of moderated mediation with BCa CI	All conditional process models
$f^2$	Effect size for interaction	Moderation and conditional process
$\kappa^2$	Effect size for indirect effect	Mediation

11. Advanced Topics

11.1 Multicategorical Predictors in Mediation and Moderation

When $X$ is a multicategorical variable with $g$ groups, it must be represented by $g - 1$ dummy variables (or Helmert/effects codes). For a 3-group predictor ( $X$ : Control, Treatment A, Treatment B) with Control as the reference:

$D_1 = 1$ if Treatment A, 0 otherwise

$D_2 = 1$ if Treatment B, 0 otherwise

In mediation: Separate a-paths ( $a_1$ , $a_2$ ) and a single b-path are estimated. The indirect effect for Treatment A vs. Control is $a_1 \times b$ ; for Treatment B vs. Control it is $a_2 \times b$ .

In moderation: Separate interaction terms are formed: $D_1 \times W$ and $D_2 \times W$ . Each interaction tests whether the moderation pattern differs between that treatment and the reference group.

11.2 Covariates in Mediation and Moderation

Covariates (control variables $C$ ) can be added to any equation in the model. Their role is to adjust for confounders and increase precision.

In mediation with covariates:

$M = i_M + aX + g_1 C + e_M$

$Y = i_Y + c'X + bM + g_2 C + e_Y$

The indirect effect $ab$ now represents the effect of $X$ on $Y$ through $M$ , controlling for $C$ . The covariate $C$ can have different coefficients in the two equations ( $g_1 \neq g_2$ ).

Best practice: Include covariates consistently in both the mediator and outcome equations unless there is a specific theoretical reason to exclude them from one equation.

11.3 Within-Person (Longitudinal) Mediation

Standard mediation assumes cross-sectional data. When measurements are taken longitudinally (e.g., $X$ at Time 1, $M$ at Time 2, $Y$ at Time 3), several approaches are available:

Simple longitudinal mediation: Use the same model structure but with temporally ordered variables. The indirect effect $a_{T1T2} \times b_{T2T3}$ is more interpretable causally.

Cross-Lagged Panel Model (CLPM): Simultaneously models the cross-lagged effects ( $X_{T1}$ predicting $M_{T2}$ , $M_{T1}$ predicting $Y_{T2}$ , etc.) and autoregressive effects ( $X_{T1}$ predicting $X_{T2}$ , etc.).

Random Intercept Cross-Lagged Panel Model (RI-CLPM): Separates between-person and within-person effects — more appropriate for causal mediation inference.

11.4 Sensitivity Analysis for Causal Mediation

Because the $M \rightarrow Y$ relationship ( $b$ -path) may be confounded by unmeasured variables, a sensitivity analysis (Imai et al., 2010) quantifies how large a confound would need to be to nullify the indirect effect.

The sensitivity parameter $\rho$ is the correlation between the residuals of the mediator and outcome equations:

$\rho = \text{Cor}(e_M, e_Y)$

If $\rho = 0$ , there is no unmeasured confounding of the $M \rightarrow Y$ relationship. The sensitivity analysis plots the indirect effect against $\rho$ and identifies the critical value $\rho^*$ at which the indirect effect becomes zero. A large $|\rho^*|$ indicates robustness to confounding; a small $|\rho^*|$ indicates fragility.

11.5 Power Analysis for Mediation and Moderation

Power for mediation (Monte Carlo approach):

For a given $a$ , $b$ , $SE_a$ , $SE_b$ , and $n$ , power is the proportion of simulated datasets in which the bootstrapped CI for $ab$ excludes zero:

Simulate $S = 1{,}000$ datasets of size $n$ from the model with coefficients $a$ and $b$ .
In each simulated dataset, compute the bootstrap CI for $ab$ .
Power = proportion of CIs that exclude zero.
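
The simulation steps above might be sketched as follows. This is an illustration under stated assumptions, not DataStatPro's routine: $X$ and both residuals are drawn as standard normal, a percentile bootstrap CI is used, and the numbers of simulations and resamples are kept small so the example runs quickly. The names `indirect_batch` and `mediation_power` are hypothetical.

```python
import numpy as np

def indirect_batch(x, m, y):
    """a*b for each row of (B, n) arrays, using closed-form OLS solutions."""
    xc = x - x.mean(axis=1, keepdims=True)
    mc = m - m.mean(axis=1, keepdims=True)
    yc = y - y.mean(axis=1, keepdims=True)
    sxx, smm = (xc * xc).sum(1), (mc * mc).sum(1)
    sxm, sxy, smy = (xc * mc).sum(1), (xc * yc).sum(1), (mc * yc).sum(1)
    a = sxm / sxx                                          # M on X
    b = (smy * sxx - sxm * sxy) / (sxx * smm - sxm**2)     # Y on M, adjusting for X
    return a * b

def mediation_power(a, b, c_prime, n, n_sims=200, n_boot=500, seed=0):
    """Proportion of simulated datasets whose 95% bootstrap CI for ab excludes 0."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_sims):
        x = rng.normal(size=n)
        m = a * x + rng.normal(size=n)
        y = c_prime * x + b * m + rng.normal(size=n)
        idx = rng.integers(0, n, size=(n_boot, n))         # all resamples at once
        boots = indirect_batch(x[idx], m[idx], y[idx])
        lo, hi = np.percentile(boots, [2.5, 97.5])
        hits += int(lo > 0 or hi < 0)
    return hits / n_sims

power = mediation_power(a=0.5, b=0.5, c_prime=0.0, n=100)
print(f"Estimated power: {power:.2f}")
```

For a real study, increase `n_sims` and `n_boot` (for example to the $S = 1{,}000$ used above and several thousand resamples) and repeat over a range of $n$ to find the smallest sample that reaches the target power.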

Power for moderation (analytical approach):

Given target $f^2$ , desired power (typically 0.80), and $\alpha$ (typically 0.05), use Cohen's power formula:

$n = \frac{L}{f^2} + k + 1$

Where $L$ is obtained from power tables for the F-distribution and $k$ is the number of predictors. In G*Power: to power the test of the interaction term, use "F tests: Linear multiple regression: Fixed model, $R^2$ increase" with one tested predictor and $f^2 = \Delta R^2 / (1 - R^2_{full})$ as the effect size.

11.6 Reporting Standards for Mediation and Moderation

The APA Journal Article Reporting Standards (JARS), together with common practice in the mediation and moderation literature, suggest reporting the following:

For Mediation:

Point estimate and 95% BCa CI for the indirect effect $ab$ .
Number of bootstrap samples used.
Point estimates and significance tests for $a$ , $b$ , $c'$ , and $c$ .
Effect size ( $\kappa^2$ or $ab_{cs}$ ).
Proportion mediated (with appropriate caveats).
Path diagram with all coefficients labelled.

For Moderation:

The full regression table including all terms.
$\Delta R^2$ and $F$ -test for the interaction term.
Simple slopes (with SEs and 95% CIs) at low, mean, and high values of $W$ .
Johnson-Neyman region of significance.
An interaction plot.
Effect size ( $f^2$ ).

For Conditional Process:

All of the above, plus the conditional indirect effects table at each value of $W$ .
Index of moderated mediation with bootstrapped 95% CI.

12. Worked Examples

Example 1: Simple Mediation — Stress, Rumination, and Depression

A researcher hypothesises that the effect of perceived stress ( $X$ ) on depression ( $Y$ ) is mediated by rumination ( $M$ ) — a pattern of repetitively thinking about one's distress.

Sample: $n = 300$ adults. Estimator: OLS with BCa bootstrapping ( $B = 10{,}000$ ). Variables: All measured on validated scales; all continuous.

Model equations:

Equation 1 (predicting rumination from stress):

$\hat{M} = 8.42 + 0.62X$

$a = 0.62, \; SE = 0.07, \; t(298) = 8.86, \; p < .001$

Equation 2 (predicting depression from stress and rumination):

$\hat{Y} = 3.15 + 0.19X + 0.54M$

$c' = 0.19, \; SE = 0.08, \; t(297) = 2.38, \; p = .018$

$b = 0.54, \; SE = 0.06, \; t(297) = 9.00, \; p < .001$

Total effect equation:

$\hat{Y} = 7.70 + 0.52X$

$c = 0.52, \; SE = 0.07, \; t(298) = 7.43, \; p < .001$

Effects Decomposition:

Effect	Estimate	SE	95% BCa CI	Significant?
Total effect ( $c$ )	0.520	0.070	[0.382, 0.658]	Yes
Direct effect ( $c'$ )	0.190	0.080	[0.033, 0.347]	Yes
Indirect effect ( $ab$ )	0.335	—	[0.215, 0.465]	Yes
Proportion mediated	64.4%	—	—	—

Verification: $c' + ab = 0.190 + 0.335 = 0.525 \approx c = 0.520$ ✅ (rounding)

Effect Size: $\kappa^2 = 0.28$ — large indirect effect.

Interpretation: The indirect effect of stress on depression through rumination is $ab = 0.335$ (95% BCa CI [0.215, 0.465]), indicating that rumination significantly mediates the stress-depression relationship. This is partial mediation — both the indirect effect ( $ab = 0.335$ , CI excludes zero) and the direct effect ( $c' = 0.190$ , $p = .018$ ) are statistically significant. Approximately 64% of the total effect of stress on depression ( $c = 0.520$ ) is transmitted through rumination. Each one-unit increase in perceived stress is associated with a 0.62-unit increase in rumination ( $a = 0.62$ ), which in turn is associated with a 0.54-unit increase in depression controlling for stress ( $b = 0.54$ ), yielding a total indirect effect of $0.62 \times 0.54 = 0.335$ points on the depression scale.

Example 2: Parallel Multiple Mediation — Training, Motivation, Self-Efficacy, and Performance

A management researcher hypothesises that training ( $X$ ) improves job performance ( $Y$ ) through two parallel mechanisms: intrinsic motivation ( $M_1$ ) and self-efficacy ( $M_2$ ).

Sample: $n = 320$ employees. Method: BCa bootstrap, $B = 10{,}000$ .

Model equations:

$\hat{M}_1 = 2.10 + 0.48X$ → $a_1 = 0.48$ , $SE = 0.09$ , $p < .001$

$\hat{M}_2 = 3.25 + 0.61X$ → $a_2 = 0.61$ , $SE = 0.08$ , $p < .001$

$\hat{Y} = 1.82 + 0.22X + 0.39M_1 + 0.31M_2$

$c' = 0.22$ , $SE = 0.11$ , $p = .046$

$b_1 = 0.39$ , $SE = 0.08$ , $p < .001$ (motivation → performance)

$b_2 = 0.31$ , $SE = 0.07$ , $p < .001$ (self-efficacy → performance)

Total effect: $c = 0.22 + (0.48 \times 0.39) + (0.61 \times 0.31) = 0.22 + 0.187 + 0.189 = 0.596$

Specific Indirect Effects:

Path	IE	95% BCa CI	Significant?
Training → Motivation → Performance	$0.48 \times 0.39 = 0.187$	[0.086, 0.311]	Yes
Training → Self-Efficacy → Performance	$0.61 \times 0.31 = 0.189$	[0.091, 0.308]	Yes
Total indirect	$0.376$	[0.201, 0.551]	Yes
Direct effect ( $c'$ )	$0.220$	[0.004, 0.436]	Yes
Contrast ( $M_1$ vs. $M_2$ )	$0.187 - 0.189 = -0.002$	[-0.152, 0.143]	No

Interpretation: Both intrinsic motivation (IE = 0.187, 95% BCa CI [0.086, 0.311]) and self-efficacy (IE = 0.189, 95% BCa CI [0.091, 0.308]) significantly mediate the effect of training on job performance. The contrast between the two indirect effects is small and not significant (contrast = -0.002, 95% BCa CI [-0.152, 0.143]), so there is no evidence that one mediator transmits more of training's effect on performance than the other. The total indirect effect (0.376) accounts for 63% of the total effect (0.376 / 0.596), indicating substantial partial mediation.

Example 3: Simple Moderation — Social Support Buffers Stress Effects on Burnout

A researcher hypothesises that the effect of work demands ( $X$ ) on burnout ( $Y$ ) is weaker when social support ( $W$ ) is high. All variables are mean-centred.

Sample: $n = 280$ nurses. Method: OLS regression.

Model:

$Y = b_0 + b_1 X_c + b_2 W_c + b_3 (X_c \times W_c) + e$

Results:

Term	$b$	SE	$t$	$p$	95% CI
Constant	3.82	0.08	47.75	$<.001$	[3.66, 3.98]
Work Demands ( $b_1$ )	0.58	0.09	6.44	$<.001$	[0.400, 0.760]
Social Support ( $b_2$ )	-0.31	0.08	-3.88	$<.001$	[-0.468, -0.152]
Work Demands × Social Support ( $b_3$ )	-0.19	0.06	-3.17	$.002$	[-0.308, -0.072]

$R^2 = .428$ , $F(3, 276) = 68.75$ , $p < .001$

$\Delta R^2 = .041$ for interaction, $F(1, 276) = 10.05$ , $p = .002$ , $f^2 = .072$ (small to medium by Cohen's benchmarks)

Simple Slopes (Spotlight Analysis):

Values of $W$ : $W_{Low} = -0.85$ (−1SD), $W_{Mean} = 0$ , $W_{High} = +0.85$ (+1SD)

$\hat{\theta}(w) = 0.58 + (-0.19) \times w$

$W$ Level	$w$	Simple Slope	SE	$t$	$p$	95% CI
Low ( $-1$ SD)	$-0.85$	$0.58 + (-0.19)(-0.85) = 0.741$	0.121	6.12	$<.001$	[0.503, 0.979]
Mean	$0.00$	$0.58 + (-0.19)(0.00) = 0.580$	0.090	6.44	$<.001$	[0.400, 0.760]
High ( $+1$ SD)	$+0.85$	$0.58 + (-0.19)(0.85) = 0.419$	0.112	3.74	$<.001$	[0.199, 0.639]
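
The simple slopes in the table follow directly from $\hat{\theta}(w) = b_1 + b_3 w$. The sketch below recomputes them from the reported coefficients. The function name is illustrative, and the standard errors are copied from the table rather than recomputed, because the covariance between $b_1$ and $b_3$ is not reported in this example. Results agree with the table up to rounding.

```python
# Coefficients from the Example 3 moderation model (mean-centred predictors).
b1, b3 = 0.58, -0.19
sd_w = 0.85                      # SD of social support, as reported

def simple_slope(w: float) -> float:
    """Conditional effect of work demands on burnout at moderator value w."""
    return b1 + b3 * w

# Standard errors as reported in the table (not recomputed here, because
# the covariance of b1 and b3 is not given in the example).
reported_se = {-sd_w: 0.121, 0.0: 0.090, sd_w: 0.112}

for w, se in reported_se.items():
    theta = simple_slope(w)
    print(f"w = {w:+.2f}: slope = {theta:.3f}, t = {theta / se:.2f}")
```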

Johnson-Neyman Analysis:

The effect of work demands on burnout is statistically significant ( $p < .05$ ) across the entire observed range of social support, so there is no Johnson-Neyman transition point within the data. The effect is nevertheless significantly smaller at high social support ( $\theta = 0.419$ ) than at low social support ( $\theta = 0.741$ ), as the significant interaction term shows.
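
Johnson-Neyman boundaries are the moderator values at which $|\hat{\theta}(w)| / SE[\hat{\theta}(w)]$ equals the critical $t$ value, which reduces to a quadratic equation in $w$. The sketch below solves that quadratic. It is not DataStatPro's implementation: the function name is hypothetical, the critical value 1.97 is an approximation for 276 residual degrees of freedom, and the covariance between $b_1$ and $b_3$ is set to zero purely as a placeholder because the example does not report it.

```python
import math

def johnson_neyman(b1, b3, var_b1, var_b3, cov_b1_b3, t_crit):
    """Return moderator values w where |theta(w) / SE(theta(w))| equals t_crit.

    theta(w) = b1 + b3*w and SE^2 = var_b1 + 2*w*cov_b1_b3 + w^2*var_b3,
    so the boundaries solve the quadratic A*w^2 + B*w + C = 0 below.
    """
    t2 = t_crit ** 2
    A = b3 ** 2 - t2 * var_b3
    B = 2 * (b1 * b3 - t2 * cov_b1_b3)
    C = b1 ** 2 - t2 * var_b1
    if A == 0:
        return [-C / B] if B != 0 else []
    disc = B ** 2 - 4 * A * C
    if disc < 0:
        return []                      # no transition points exist
    root = math.sqrt(disc)
    return sorted([(-B - root) / (2 * A), (-B + root) / (2 * A)])

# Example 3 estimates; the covariance of b1 and b3 is NOT reported in the
# example, so 0.0 is a placeholder assumption used only for illustration.
bounds = johnson_neyman(b1=0.58, b3=-0.19, var_b1=0.09 ** 2, var_b3=0.06 ** 2,
                        cov_b1_b3=0.0, t_crit=1.97)
print([round(w, 2) for w in bounds])
```

Under the placeholder covariance of zero, both solutions lie above $w = +0.85$ (+1 SD). With the actual covariance from the fitted model the boundaries would differ, so these values should be read as an illustration of the method only.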

Interaction Plot:

Work demands effects on burnout:

Low support ( $-1\text{SD}$ ): Steep positive slope — high demands strongly increase burnout.
Mean support: Moderate positive slope.
High support ( $+1\text{SD}$ ): Flatter positive slope — demands still increase burnout but less strongly.

Interpretation: There is a significant negative interaction between work demands and social support on burnout ( $b_3 = -0.19$ , $p = .002$ , $\Delta R^2 = .041$ ). Consistent with the buffering hypothesis, social support significantly attenuates the effect of work demands on burnout. The simple slope of work demands on burnout is 0.741 at low social support ( $p < .001$ ) but only 0.419 at high social support ( $p < .001$ ). Although the effect of work demands remains significant at all levels of social support, its magnitude is substantially smaller when social support is high, providing support for the buffering effect of social support against work-related burnout.

Example 4: First-Stage Moderated Mediation — Exercise, Self-Efficacy, and Wellbeing (Moderated by Age)

A researcher proposes that age ( $W$ ) moderates the first stage of the mediated process by which exercise ( $X$ ) affects wellbeing ( $Y$ ) through self-efficacy ( $M$ ). Specifically, exercise may boost self-efficacy more strongly in younger adults than older adults, because younger people experience more immediate fitness gains from exercise.

Sample: $n = 400$ adults (ages 18–70). Method: BCa bootstrap, $B = 10{,}000$ . All continuous predictors mean-centred.

Equation 1 (a-path moderated by Age):

$\hat{M} = 42.15 + 0.61X_c + (-0.18)W_c + (-0.24)(X_c \times W_c)$

$a_1 = 0.61$ , $SE = 0.09$ , $p < .001$ (effect of exercise on self-efficacy at mean age)

$a_2 = -0.18$ , $SE = 0.07$ , $p = .011$ (effect of age on self-efficacy)

$a_3 = -0.24$ , $SE = 0.08$ , $p = .003$ (interaction: exercise × age on self-efficacy)

Equation 2 (b-path unmoderated):

$\hat{Y} = 58.32 + 0.18X_c + 0.43M$

$b = 0.43$ , $SE = 0.07$ , $p < .001$ ; $c' = 0.18$ , $SE = 0.10$ , $p = .072$

Conditional Indirect Effects:

$\text{IE}(w) = (a_1 + a_3 w) \times b = (0.61 - 0.24w) \times 0.43$

Values of $W$ (Age): $W_{Low} = -10.5$ years (−1SD); $W_{Mean} = 0$ ; $W_{High} = +10.5$ years (+1SD)

Age Level	$w$	Conditional a-path	$\text{IE}(w) = (0.61 - 0.24w) \times 0.43$	95% BCa CI	Significant?
Young ( $-1$ SD)	$-10.5$	$0.61 + 2.52 = 3.13$	$3.13 \times 0.43 = 1.346$	[0.842, 1.887]	Yes
Mean Age	$0$	$0.61$	$0.61 \times 0.43 = 0.262$	[0.108, 0.435]	Yes
Older ( $+1$ SD)	$+10.5$	$0.61 - 2.52 = -1.91$	$-1.91 \times 0.43 = -0.821$	[-1.298, -0.389]	Yes

💡 Note: The reversal of the indirect effect direction at high age indicates that for older adults, exercise may unexpectedly reduce self-efficacy — perhaps due to increased fatigue or injury risk. This merits theoretical attention.

Index of Moderated Mediation:

$\text{IMM} = a_3 \times b = -0.24 \times 0.43 = -0.103$

95% BCa CI for IMM: $[-0.178, -0.036]$ → excludes zero → significant moderated mediation

Interpretation: The indirect effect of exercise on wellbeing through self-efficacy is significantly moderated by age (IMM = −0.103, 95% BCa CI [−0.178, −0.036]). The conditional indirect effects reveal a striking pattern: for younger adults ( $-1\text{SD}$ below the mean age), the indirect effect is large and positive (IE = 1.346, 95% CI [0.842, 1.887]), indicating that exercise strongly boosts self-efficacy which in turn improves wellbeing. For adults at the mean age, the indirect effect is smaller but still positive (IE = 0.262, 95% CI [0.108, 0.435]). For older adults ( $+1\text{SD}$ above mean age), the indirect effect reverses sign (IE = −0.821, 95% CI [−1.298, −0.389]), suggesting that exercise may counterintuitively reduce self-efficacy in this group. These findings highlight the critical role of age in determining whether exercise promotes self-efficacy and, ultimately, wellbeing.
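
Because $\text{IE}(w) = (a_1 + a_3 w) \times b$ is linear in $w$, the index of moderated mediation is simply its slope. The sketch below reproduces the conditional indirect effects from the reported point estimates and checks that relationship. The names are illustrative, and the bootstrap intervals are not reproduced because they require the raw data.

```python
# Point estimates from Example 4 (first-stage moderated mediation).
a1, a3 = 0.61, -0.24     # exercise -> self-efficacy, and exercise x age interaction
b = 0.43                 # self-efficacy -> wellbeing
sd_age = 10.5            # SD of centred age, in years

def conditional_indirect(w: float) -> float:
    """Indirect effect of exercise on wellbeing at centred age w."""
    return (a1 + a3 * w) * b

imm = a3 * b             # index of moderated mediation

for label, w in [("young (-1 SD)", -sd_age), ("mean age", 0.0), ("older (+1 SD)", sd_age)]:
    print(f"{label:>14}: IE = {conditional_indirect(w):.3f}")

# IE(w) is linear in w, so the IMM is its slope: the change in the
# indirect effect for each additional year of age.
slope = conditional_indirect(1.0) - conditional_indirect(0.0)
print(f"IMM = {imm:.3f}, IE(1) - IE(0) = {slope:.3f}")
```

One consequence of the linear form is that the point estimate of $\text{IE}(w)$ changes sign at $w = -a_1 / a_3 \approx 2.54$ years above the mean age, which is why the $\pm 1$ SD values fall on opposite sides of zero.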

Example 5: Second-Stage Moderated Mediation — Leadership, Trust, and Performance (Moderated by Autonomy)

A researcher tests whether autonomy ( $W$ ) moderates the second stage (b-path) of the process by which transformational leadership ( $X$ ) improves team performance ( $Y$ ) through team trust ( $M$ ). Trust may translate into performance more effectively in high-autonomy environments.

Sample: $n = 350$ teams. Method: BCa bootstrap, $B = 10{,}000$ .

Equation 1 (a-path unmoderated):

$\hat{M} = 2.85 + 0.52X_c$ → $a = 0.52$ , $SE = 0.08$ , $p < .001$

Equation 2 (b-path moderated by Autonomy):

$\hat{Y} = 3.42 + 0.15X_c + 0.38M_c + 0.22W_c + 0.29(M_c \times W_c)$

$b_1 = 0.38$ , $b_3 = 0.29$ , $SE_{b_3} = 0.09$ , $p = .001$

Conditional Indirect Effects:

$\text{IE}(w) = a \times (b_1 + b_3 w) = 0.52 \times (0.38 + 0.29w)$

Autonomy Level	$w$	Conditional b-path	$\text{IE}(w)$	95% BCa CI	Significant?
Low ( $-1$ SD)	$-0.82$	$0.38 + (0.29)(-0.82) = 0.142$	$0.52 \times 0.142 = 0.074$	[-0.038, 0.193]	No
Mean	$0$	$0.38$	$0.52 \times 0.38 = 0.198$	[0.087, 0.325]	Yes
High ( $+1$ SD)	$+0.82$	$0.38 + (0.29)(0.82) = 0.618$	$0.52 \times 0.618 = 0.321$	[0.181, 0.476]	Yes

Index of Moderated Mediation:

$\text{IMM} = a \times b_3 = 0.52 \times 0.29 = 0.151$

95% BCa CI for IMM: $[0.063, 0.253]$ → excludes zero → significant moderated mediation

Interpretation: The indirect effect of transformational leadership on team performance through trust is significantly moderated by team autonomy (IMM = 0.151, 95% BCa CI [0.063, 0.253]). The trust-based indirect pathway is significant for teams with mean autonomy (IE = 0.198, 95% CI [0.087, 0.325]) and high autonomy (IE = 0.321, 95% CI [0.181, 0.476]), but not for teams with low autonomy (IE = 0.074, 95% CI [−0.038, 0.193]). This suggests that transformational leadership builds trust, but this trust only translates into improved performance when the team operates in a sufficiently autonomous environment where trust can meaningfully guide decision-making and collaborative action.
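
When only coefficients and standard errors are available, a Monte Carlo interval offers a rough robustness check on the index of moderated mediation. The sketch below is not the BCa bootstrap used in this example: it draws $a$ and $b_3$ from normal distributions centred on their estimates and treats the two as independent, which is an assumption. The function name, seed, and number of draws are illustrative.

```python
import random

def monte_carlo_ci(est1, se1, est2, se2, draws=20_000, seed=12345, level=0.95):
    """Monte Carlo CI for the product of two coefficients.

    Assumes each estimate is normally distributed and that the two
    estimates are independent (they come from separate regression equations).
    """
    rng = random.Random(seed)
    products = sorted(rng.gauss(est1, se1) * rng.gauss(est2, se2)
                      for _ in range(draws))
    lo_idx = int((1 - level) / 2 * draws)
    hi_idx = int((1 + level) / 2 * draws) - 1
    return products[lo_idx], products[hi_idx]

# a = 0.52 (SE 0.08) and b3 = 0.29 (SE 0.09), as reported in Example 5.
lo, hi = monte_carlo_ci(0.52, 0.08, 0.29, 0.09)
print(f"IMM = {0.52 * 0.29:.3f}, Monte Carlo 95% CI = [{lo:.3f}, {hi:.3f}]")
```

The resulting interval will not match the BCa interval exactly, but if it also excludes zero it supports the same conclusion about moderated mediation.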

13. Common Mistakes and How to Avoid Them

Mistake 1: Using the Baron-Kenny Causal Steps Approach

Problem: The Baron-Kenny (1986) procedure requires $X$ to significantly predict $Y$ before testing mediation. This requirement is unnecessary: a significant indirect effect can exist even when the total effect $c$ is not significant (e.g., in inconsistent mediation, where the direct and indirect effects have opposite signs and offset each other). The causal steps approach is also commonly paired with the Sobel test, which assumes that the sampling distribution of $ab$ is normal, whereas the product of two coefficients is typically skewed.
Solution: Use the modern approach: bootstrap the indirect effect and construct a 95% BCa CI. Test mediation by whether the CI for $ab$ excludes zero — regardless of the significance of $c$ .
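
A bootstrap test of the indirect effect can be sketched in plain Python as follows. This is a simplified illustration, not DataStatPro's procedure: it builds a percentile interval rather than a BCa interval, the data are simulated, and all function names are hypothetical. The demonstration call uses only 1,000 resamples so that it runs quickly.

```python
import random

def mean(v):
    return sum(v) / len(v)

def cov(u, v):
    mu, mv = mean(u), mean(v)
    return sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v)) / (len(u) - 1)

def indirect_effect(x, m, y):
    """a*b from M ~ X and Y ~ X + M, using closed-form OLS slopes."""
    sxx, smm, sxm = cov(x, x), cov(m, m), cov(x, m)
    sxy, smy = cov(x, y), cov(m, y)
    a = sxm / sxx
    b = (smy * sxx - sxy * sxm) / (sxx * smm - sxm ** 2)
    return a * b

def percentile_bootstrap_ci(x, m, y, n_boot=5000, seed=2024, level=0.95):
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]   # resample cases with replacement
        estimates.append(indirect_effect([x[i] for i in idx],
                                         [m[i] for i in idx],
                                         [y[i] for i in idx]))
    estimates.sort()
    lo = estimates[int((1 - level) / 2 * n_boot)]
    hi = estimates[int((1 + level) / 2 * n_boot) - 1]
    return lo, hi

# Simulated (hypothetical) data with true a = 0.6, b = 0.5, c' = 0.
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(200)]
m = [0.6 * xi + rng.gauss(0, 1) for xi in x]
y = [0.5 * mi + rng.gauss(0, 1) for mi in m]

ab = indirect_effect(x, m, y)
lo, hi = percentile_bootstrap_ci(x, m, y, n_boot=1000)   # small B for speed only
print(f"ab = {ab:.3f}, 95% percentile CI = [{lo:.3f}, {hi:.3f}]")
```

Mediation is supported when the interval excludes zero, whatever the significance of the total effect $c$.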

Mistake 2: Claiming Causal Mediation from Cross-Sectional Data

Problem: Mediation analysis uses causal language (" $X$ affects $Y$ through $M$ ") that implies temporal precedence and absence of confounding. Cross-sectional data cannot establish either condition — all three variables are measured simultaneously, making it impossible to confirm that $X$ caused $M$ before $M$ caused $Y$ .
Solution: Acknowledge this limitation explicitly. Use language such as "consistent with a mediation process" or "results support the hypothesised mediation pathway." Where possible, use longitudinal designs or experimental manipulation of $X$ and/or $M$ .

Mistake 3: Not Mean-Centering Before Computing the Interaction Term

Problem: Using raw (uncentred) scores to compute the interaction term $X \times W$ often produces severe multicollinearity: VIFs for $X$ , $W$ , and $X \times W$ can exceed 30–50. This inflates the standard errors of the lower-order coefficients and makes $b_1$ and $b_2$ hard to interpret (they represent the effect of $X$ at $W = 0$ and the effect of $W$ at $X = 0$ , values that may be completely outside the observed range of the data).
Solution: Always mean-centre (or standardise) continuous predictors and moderators before computing interaction terms. Mean-centring reduces this multicollinearity, makes each lower-order coefficient interpretable as the effect at the mean of the other variable, and leaves the interaction coefficient $b_3$ and its significance test unchanged (standardising rescales $b_3$ but not its $t$ statistic).
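
The effect of centring can be illustrated without fitting a model, because centring only re-expresses the same regression surface. The sketch below uses simulated predictors and made-up raw-score coefficients (all hypothetical) to show that the correlation between $X$ and $X \times W$ drops after centring, and that the interaction coefficient is carried over unchanged.

```python
import random

def pearson(u, v):
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / (suu * svv) ** 0.5

def centred_coefficients(b0, b1, b2, b3, x_mean, w_mean):
    """Re-express Y = b0 + b1*X + b2*W + b3*X*W in terms of centred X and W."""
    return (b0 + b1 * x_mean + b2 * w_mean + b3 * x_mean * w_mean,
            b1 + b3 * w_mean,     # slope of X at the mean of W
            b2 + b3 * x_mean,     # slope of W at the mean of X
            b3)                   # interaction is unchanged

# Hypothetical raw-score predictors located far from zero.
rng = random.Random(7)
x = [rng.gauss(50, 10) for _ in range(2000)]
w = [rng.gauss(30, 5) for _ in range(2000)]
x_mean, w_mean = sum(x) / len(x), sum(w) / len(w)
xc = [xi - x_mean for xi in x]
wc = [wi - w_mean for wi in w]

r_raw = pearson(x, [xi * wi for xi, wi in zip(x, w)])
r_centred = pearson(xc, [a * b for a, b in zip(xc, wc)])
print(f"corr(X, XW): raw = {r_raw:.2f}, centred = {r_centred:.2f}")

# Hypothetical raw-score coefficients, converted to the centred metric.
print(centred_coefficients(b0=1.0, b1=0.9, b2=0.4, b3=-0.01,
                           x_mean=x_mean, w_mean=w_mean))
```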

Mistake 4: Interpreting the Main Effect of X as the "Effect of X" in a Moderation Model

Problem: In a moderation model $Y = b_0 + b_1 X + b_2 W + b_3 (X \times W)$ , $b_1$ is NOT the overall effect of $X$ on $Y$ . It is the conditional effect of $X$ when $W = 0$ (or at the mean of $W$ if centred). Reporting $b_1$ as the "main effect of $X$ " and interpreting it in isolation is incorrect when a significant interaction is present.
Solution: When the interaction is significant, focus on the simple slopes at meaningful values of $W$ rather than interpreting $b_1$ in isolation. Only report $b_1$ as "the effect of $X$ at the mean of $W$ " (after centering).

Mistake 5: Running Separate Simple Mediations Instead of Parallel Mediation

Problem: When testing multiple mediators, some researchers run separate simple mediation models (one mediator at a time). When the mediators are correlated, this produces biased estimates because each specific indirect effect does not control for the other mediators. The sum of the specific indirect effects from separate models will then generally not equal the total indirect effect from a simultaneous model.
Solution: Always include all hypothesised mediators in a single parallel mediation model. The specific indirect effects from this model correctly partial out the shared variance among mediators.
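
The difference between the two strategies can be seen in a small simulation. In the sketch below, two mediators share a common residual component, so each one partly stands in for the other when it is modelled alone. The data-generating values and the small `ols` helper are hypothetical and serve only to illustrate the point.

```python
import random

def ols(columns, y):
    """OLS coefficients [intercept, slopes...] via the normal equations."""
    rows = [[1.0] + list(r) for r in zip(*columns)]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y], solved by Gauss-Jordan elimination.
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    for i in range(k):
        pivot = max(range(i, k), key=lambda r: abs(aug[r][i]))
        aug[i], aug[pivot] = aug[pivot], aug[i]
        for r in range(k):
            if r != i:
                f = aug[r][i] / aug[i][i]
                aug[r] = [ar - f * ai for ar, ai in zip(aug[r], aug[i])]
    return [aug[i][k] / aug[i][i] for i in range(k)]

rng = random.Random(3)
n = 5000
x = [rng.gauss(0, 1) for _ in range(n)]
shared = [rng.gauss(0, 1) for _ in range(n)]          # makes M1 and M2 correlated
m1 = [0.5 * xi + si for xi, si in zip(x, shared)]
m2 = [0.5 * xi + 0.6 * si + rng.gauss(0, 1) for xi, si in zip(x, shared)]
y = [0.2 * xi + 0.4 * a + 0.3 * b + rng.gauss(0, 1) for xi, a, b in zip(x, m1, m2)]

a1 = ols([x], m1)[1]
a2 = ols([x], m2)[1]
c_total = ols([x], y)[1]

_, c_prime, b1, b2 = ols([x, m1, m2], y)              # parallel (simultaneous) model
joint_sum = a1 * b1 + a2 * b2

b1_sep = ols([x, m1], y)[2]                           # one mediator at a time
b2_sep = ols([x, m2], y)[2]
separate_sum = a1 * b1_sep + a2 * b2_sep

print(f"c - c' = {c_total - c_prime:.3f}")
print(f"parallel model: a1*b1 + a2*b2 = {joint_sum:.3f}")
print(f"separate models: sum of indirect effects = {separate_sum:.3f}")
```

In the simultaneous model the specific indirect effects add up to $c - c'$ exactly, whereas the separate models count the shared variance twice and overstate the total indirect effect in this setup.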

Mistake 6: Ignoring the Direction and Meaning of the Interaction

Problem: Researchers sometimes report "a significant interaction was found" without fully probing and plotting it. A significant $b_3$ is only the starting point — the nature and direction of the interaction must be unpacked through simple slopes analysis and a plot to be meaningfully interpreted and communicated.
Solution: Always: (a) compute and report simple slopes at low, mean, and high values of $W$ ; (b) create and include an interaction plot; (c) describe in plain language what the interaction means (e.g., "the effect of $X$ on $Y$ was stronger/weaker/reversed when $W$ was high").

Mistake 7: Concluding Partial vs. Full Mediation Based on $c'$ Significance

Problem: Classifying mediation as "partial" ( $c'$ significant) or "full" ( $c'$ not significant) based on whether the direct effect is statistically significant is misleading. With large samples, trivially small direct effects are "significant," leading to "partial mediation." With small samples, substantial direct effects may be "non-significant," incorrectly suggesting "full mediation." Statistical significance of $c'$ depends on sample size, not on the theoretical importance of the direct path.
Solution: Report and interpret the magnitude of $ab$ relative to $c$ (proportion mediated or $\kappa^2$ ). Reserve "full mediation" for cases where the direct effect is both non-significant AND practically negligible (close to zero in effect size terms).

Mistake 8: Using Too Few Bootstrap Samples

Problem: Using fewer than 1,000 bootstrap samples produces unstable CI estimates, especially near the CI boundaries. The same analysis run twice with 500 bootstrap samples may produce noticeably different CI bounds.
Solution: Use a minimum of $B = 5{,}000$ bootstrap samples for exploratory work and $B = 10{,}000$ for published research. DataStatPro defaults to 5,000; increase to 10,000 for publication-ready analyses.

Mistake 9: Ignoring Sequential Ordering in Serial Mediation

Problem: In sequential mediation, the causal order of mediators ( $M_1 \rightarrow M_2$ vs. $M_2 \rightarrow M_1$ ) is a theoretical claim that changes all model equations, indirect effects, and conclusions. Researchers sometimes arbitrarily choose an order or test both orders and report the "better" one without theoretical justification.
Solution: Specify the causal order of mediators based on theory and, ideally, temporal precedence in data collection. If the order is theoretically ambiguous, conduct both models as a sensitivity check and clearly acknowledge the uncertainty in conclusions.

Mistake 10: Failing to Report the Index of Moderated Mediation

Problem: In conditional process models, researchers sometimes report only the conditional indirect effects at specific values of $W$ (e.g., "the indirect effect was significant at high $W$ but not at low $W$ "), without testing whether the indirect effect is significantly moderated overall. Differences in significance at different values of $W$ do not by themselves constitute evidence that the indirect effect differs significantly across those values.
Solution: Always compute and report the index of moderated mediation ( $\text{IMM} = a_3 b$ or $\text{IMM} = a b_3$ ) with its bootstrapped 95% CI. A CI that excludes zero is the primary test that the indirect effect is significantly moderated by $W$ .

14. Troubleshooting

Problem	Likely Cause	Solution
Indirect effect CI is very wide	Small sample; weak $a$ or $b$ path; high variability	Increase sample size; use more reliable measures of $M$ and $Y$ ; check for outliers
Indirect effect is non-zero but CI includes zero	Insufficient power; inconsistent mediation	Increase $n$ ; increase $B$ (bootstrap samples); check for sign changes in paths
$c' + ab$ does not equal $c$ (large discrepancy)	Rounding error; mediators correlated; computational error	Verify variable coding; use unstandardised coefficients; recheck calculations
VIF $> 10$ for $X$ , $W$ , or $X \times W$	Predictors not centred; extreme multicollinearity	Mean-centre all continuous predictors and moderators before computing interactions
Interaction non-significant despite theoretical prediction	Insufficient power for small $f^2$ ; wrong functional form; wrong moderator	Conduct a priori power analysis; check for non-linear interaction; try alternative moderators
Simple slopes all in same direction despite significant interaction	The interaction changes magnitude but not direction	Report the magnitude change; use Johnson-Neyman to find where effect size changes meaningfully
Proportion mediated exceeds 1.0 or is negative	Inconsistent mediation ( $c'$ and $ab$ have opposite signs); total effect near zero	Do not interpret proportion mediated in this case; report only $ab$ and $c'$ separately
Bootstrapped CI is asymmetric around point estimate	Skewed bootstrap distribution; small $n$ ; near-zero paths	Expected and acceptable; BCa CI handles this; do not report Sobel-based symmetric CI
$R^2_M$ is very small ( $< 0.05$ )	$X$ is a weak predictor of $M$ ; a-path is trivially small	Check whether $M$ is the correct mediator; assess $a$ and $b$ path magnitudes separately
Model equations fail to converge	Perfect multicollinearity; sample too small for model complexity	Reduce predictors; mean-centre; simplify model; collect more data
All simple slopes significant at all values of $W$	Very strong main effect dominates interaction	Interaction may still be meaningful; report magnitude changes using Johnson-Neyman
Sequential mediation indirect effects do not sum correctly	Error in path specification; correlations among mediators not accounted for	Check that the model equation for $M_2$ includes $M_1$ as a predictor
IMM (index of moderated mediation) CI includes zero despite conditional IEs varying	Conditional IEs may differ descriptively but not statistically	IMM is the correct test; do not claim significant moderated mediation without IMM CI excluding zero
Bootstrap results differ substantially across runs	Too few bootstrap samples; seed not set	Increase $B$ to 10,000; set a fixed random seed for reproducibility

15. Quick Reference Cheat Sheet

Core Equations

Formula	Description
$M = i_M + aX + e_M$	Path a equation (simple mediation)
$Y = i_Y + c'X + bM + e_Y$	Path b and c' equation (simple mediation)
$\text{IE} = a \times b$	Indirect effect
$c = c' + ab$	Total effect identity
$PM = ab/c$	Proportion mediated (use with caution)
$SE_{ab} = \sqrt{a^2 SE_b^2 + b^2 SE_a^2}$	Standard error of the indirect effect (first-order Sobel)
$Y = b_0 + b_1X + b_2W + b_3(X \times W) + e$	Simple moderation model
$\hat{\theta}_{XY}(w) = b_1 + b_3 w$	Conditional effect of $X$ at $W = w$
$SE[\hat{\theta}(w)] = \sqrt{\widehat{Var}(b_1) + 2w\widehat{Cov}(b_1,b_3) + w^2\widehat{Var}(b_3)}$	SE of simple slope
$\text{IE}(w) = (a_1 + a_3 w) \times b$	Conditional IE (first-stage moderation)
$\text{IE}(w) = a \times (b_1 + b_3 w)$	Conditional IE (second-stage moderation)
$\text{IE}(w) = (a_1 + a_3 w)(b_1 + b_3 w)$	Conditional IE (both-stage moderation)
$\text{IMM} = a_3 \times b$	Index of moderated mediation (first-stage)
$\text{IMM} = a \times b_3$	Index of moderated mediation (second-stage)
$\Delta R^2 = R^2_{full} - R^2_{reduced}$	Incremental $R^2$ for interaction
$f^2 = \Delta R^2 / (1 - R^2_{full})$	Cohen's $f^2$ for interaction
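
Several of the formulas above are short enough to express as helper functions. The sketch below is illustrative only: the function names are hypothetical, and the Sobel standard error is the first-order form $\sqrt{a^2 SE_b^2 + b^2 SE_a^2}$, shown for reference since bootstrap intervals are the recommended basis for inference.

```python
import math

def sobel_se(a, se_a, b, se_b):
    """First-order (Sobel) standard error of the indirect effect a*b."""
    return math.sqrt(a ** 2 * se_b ** 2 + b ** 2 * se_a ** 2)

def cohens_f2(r2_full, r2_reduced):
    """Cohen's f^2 for the increment in R^2 due to the interaction term."""
    return (r2_full - r2_reduced) / (1 - r2_full)

def simple_slope_se(var_b1, var_b3, cov_b1_b3, w):
    """Standard error of the conditional effect b1 + b3*w."""
    return math.sqrt(var_b1 + 2 * w * cov_b1_b3 + w ** 2 * var_b3)

# Example 2, self-efficacy path: a2 = 0.61 (SE 0.08), b2 = 0.31 (SE 0.07).
print(round(sobel_se(0.61, 0.08, 0.31, 0.07), 3))
# Example 3: R^2 = .428 with the interaction, Delta R^2 = .041.
print(round(cohens_f2(0.428, 0.428 - 0.041), 3))
```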

Model Selection Guide

Research Question	Appropriate Model
How does $X$ affect $Y$ ? (one mediator)	Simple Mediation
Does $X$ affect $Y$ through multiple mechanisms?	Parallel Multiple Mediation
Is there a causal chain $X \rightarrow M_1 \rightarrow M_2 \rightarrow Y$ ?	Sequential Mediation
When/for whom does $X$ affect $Y$ ? (one moderator)	Simple Moderation
Do two moderators jointly change the $X \rightarrow Y$ effect?	Multiple Moderation
Does the moderation of $X \rightarrow Y$ vary with a third variable?	Moderated Moderation (Three-Way)
Does the indirect effect ( $ab$ ) depend on level of $W$ ? (W on a-path)	First-Stage Moderated Mediation
Does the indirect effect ( $ab$ ) depend on level of $W$ ? (W on b-path)	Second-Stage Moderated Mediation
Does $W$ simultaneously moderate both $a$ and $b$ paths?	Both-Stage Moderated Mediation
Why does the $X \times W$ interaction on $Y$ exist?	Mediated Moderation
Does $W$ moderate a path in a $X \rightarrow M_1 \rightarrow M_2 \rightarrow Y$ chain?	Sequential Moderated Mediation

Mediation Effect Interpretation

Component	Symbol	Interpretation
a-path	$a$	Effect of $X$ on $M$
b-path	$b$	Effect of $M$ on $Y$ , controlling for $X$
Direct effect	$c'$	Effect of $X$ on $Y$ , controlling for $M$
Total effect	$c$	Effect of $X$ on $Y$ without $M$ in model
Indirect effect	$ab$	Effect of $X$ on $Y$ transmitted through $M$
Proportion mediated	$ab/c$	Fraction of total effect through $M$

Moderation Effect Interpretation

Pattern	Sign of $b_3$	Interpretation
Enhancing moderation	$+$	High $W$ strengthens $X \rightarrow Y$
Buffering moderation	$-$	High $W$ weakens $X \rightarrow Y$
Crossover interaction	$+$ or $-$	Simple slope of $X$ changes sign within the observed range of $W$
No moderation	$\approx 0$	Effect of $X$ on $Y$ is constant across $W$

Key Decision Rules

Decision Point	Rule
Is mediation significant?	95% BCa CI for $ab$ excludes zero
Is moderation significant?	Test of $b_3$ gives $p < .05$ (equivalently, the $F$ test of $\Delta R^2$ for the interaction term)
Is conditional process significant?	95% BCa CI for IMM excludes zero
Is simple slope significant at value $w$ ?	95% CI for $\hat{\theta}(w)$ excludes zero
Should I use Baron-Kenny?	No — always use bootstrapped indirect effects
Full vs. partial mediation?	Avoid classification; report $ab$ magnitude and $c'$ effect size

Effect Size Benchmarks

Statistic	Small	Medium	Large
$\kappa^2$ (indirect effect)	0.01	0.09	0.25
$f^2$ (interaction)	0.02	0.15	0.35
$\Delta R^2$ (interaction, observational)	0.01	0.05	0.10
$ab_{cs}$ (standardised indirect)	0.01	0.06	0.14

Minimum Sample Size Guidelines

Model	Minimum $n$	Recommended $n$
Simple mediation	100	$\geq 200$
Parallel mediation (2 mediators)	150	$\geq 300$
Sequential mediation (2 mediators)	200	$\geq 400$
Simple moderation	100	$\geq 200$
Multiple moderation	200	$\geq 300$
Three-way (moderated moderation)	300	$\geq 500$
Moderated mediation (one stage)	200	$\geq 400$
Both-stage moderated mediation	300	$\geq 500$
Sequential moderated mediation	350	$\geq 600$

Bootstrap Confidence Interval Recommendations

Situation	Method	B
Exploratory / pilot research	Percentile CI	5,000
Published research (standard)	BCa CI	10,000
Very skewed distribution	BCa CI	10,000
Robustness check	Monte Carlo CI	20,000
Near-zero indirect effect	BCa CI	10,000

This tutorial provides a comprehensive foundation for understanding, specifying, estimating, and interpreting Mediation, Moderation, and Conditional Process Analysis using the DataStatPro application. For further reading, consult Hayes' "Introduction to Mediation, Moderation, and Conditional Process Analysis" (3rd ed., 2022), MacKinnon's "Introduction to Statistical Mediation Analysis" (2008), and Preacher & Hayes' "Asymptotic and Resampling Strategies for Assessing and Comparing Indirect Effects in Multiple Mediator Models" (2008). For feature requests or support, contact the DataStatPro team.
