ANOVA Assumptions and
Kruskal-Wallis

STA4173: Biostatistics
Spring 2025

Introduction: ANOVA Assumptions

  • We previously discussed testing three or more means using ANOVA.

  • We also discussed that ANOVA is an extension of the two-sample t-test.

  • Recall that the t-test has two assumptions:

    • Equal variance between groups.

    • Normal distribution.

  • We will extend our knowledge of checking assumptions today.

ANOVA Assumptions: Definition

  • We can represent ANOVA with the following model:

y_{ij} = \mu + \tau_i + \varepsilon_{ij}

  • where:

    • y_{ij} is the j^{\text{th}} observation in the i^{\text{th}} group,
    • \mu is the overall (grand) mean,
    • \tau_i is the treatment effect for group i, and
    • \varepsilon_{ij} is the error term for the j^{\text{th}} observation in the i^{\text{th}} group.

ANOVA Assumptions: Definition

  • We assume that the error term follows a normal distribution with mean 0 and a constant variance, \sigma^2. i.e., \varepsilon_{ij} \overset{\text{iid}}{\sim} N(0, \sigma^2)

  • Very important note: the assumption is on the error term and NOT on the outcome!

  • We will use the residual (the difference between the observed value and the predicted value) to assess assumptions: e_{ij} = y_{ij} - \hat{y}_{ij}
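  • A minimal sketch of pulling these quantities in R (assuming a fitted aov object named m, as in the dental example later):
y_hat <- fitted(m)    # predicted values, y-hat_ij
e <- residuals(m)     # residuals, e_ij = y_ij - y-hat_ij
head(cbind(y_hat, e))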

ANOVA Assumptions: Graphical Assessment

  • Normality: quantile-quantile plot

    • Should have points close to the 45^\circ line
    • We will focus on the “center” portion of the plot
  • Variance: scatterplot of the residuals against the predicted values

    • Should be “equal spread” between the groups
    • No “pattern”
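  • A minimal base-R sketch of these two plots (assuming a fitted aov object m; the anova_check() function below produces polished versions of the same idea):
qqnorm(residuals(m)); qqline(residuals(m))    # quantile-quantile plot of the residuals
plot(fitted(m), residuals(m),
     xlab = "Predicted", ylab = "Residual")   # residuals vs. predicted values
abline(h = 0, lty = 2)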

ANOVA Assumptions: Graphical Assessment

  • Like with t-tests, we will assess these assumptions graphically.

  • We will return to the classpackage package and use the anova_check() function.

library(classpackage) 
anova_check(m)

ANOVA Assumptions: Graphical Assessment

  • Recall the dental example from last week,
library(tidyverse)
strength <- c(15.4, 12.9, 17.2, 16.6, 19.3,
              17.2, 14.3, 17.6, 21.6, 17.5,
               5.5,  7.7, 12.2, 11.4, 16.4,
              11.0, 12.4, 13.5,  8.9,  8.1)
system <- c(rep("Cojet",5), rep("Silistor",5), rep("Cimara",5), rep("Ceramic",5))
data <- tibble(system, strength)
m <- aov(strength ~ system, data = data)
summary(m)
            Df Sum Sq Mean Sq F value  Pr(>F)   
system       3  200.0   66.66   7.545 0.00229 **
Residuals   16  141.4    8.84                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

ANOVA Assumptions: Assessing Graphically

  • Let’s assess the assumptions,
library(classpackage)
anova_check(m)

ANOVA Assumptions: Test for Variance

  • We can formally check the variance assumption with the Brown-Forsythe-Levene (BFL) test.

    • This test transforms the data and then performs ANOVA!
  • The test statistic is calculated as follows, F_0 = \frac{\sum_{i=1}^k n_i (\bar{z}_i - \bar{z})^2/(k-1)}{\sum_{i=1}^k \sum_{j=1}^{n_i}(z_{ij}-\bar{z}_i)^2/(n-k) }, where

    • k is the number of groups,
    • n_i is the sample size of group i,
    • n = \sum_{i=1}^k n_i, and
    • z_{ij} = |y_{ij} - \tilde{y}_i|, where \tilde{y}_i is the median of group i
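  • A minimal sketch of this calculation using the dental data from earlier (the resulting F statistic should match the leveneTest() result we see shortly):
library(tidyverse)
z_data <- data %>%
  group_by(system) %>%
  mutate(z = abs(strength - median(strength))) %>%  # z_ij = |y_ij - group median|
  ungroup()
summary(aov(z ~ system, data = z_data))             # F value here is F_0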

ANOVA Assumptions: Test for Variance

  • Hypotheses

    • H_0: \ \sigma^2_1 = ... = \sigma^2_k
    • H_1: at least one \sigma^2_i is different
  • Test Statistic

    • F_0 (taken from the resulting ANOVA table)
  • p-Value

    • p = P[F_{\text{df}_{\text{Trt}}, \text{df}_{\text{E}}} \ge F_0]
  • Rejection Region

    • Reject if p < \alpha.

ANOVA Assumptions: Test for Variance

  • We will use the leveneTest() function from the car package.

    • Note: I do not load the car package because it masks a function we need from the tidyverse.
car::leveneTest(model_results)
  • In our dental example,
car::leveneTest(m)

ANOVA Assumptions: Test for Variance

  • Hypotheses

    • H_0: \ \sigma^2_1 = \sigma^2_2 = \sigma^2_3 = \sigma^2_4
    • H_1: at least one \sigma^2_i is different
  • Test Statistic and p-Value

    • F_0 = 0.734
    • p = 0.547
  • Rejection Region

    • Reject if p < \alpha; \alpha=0.01.
  • Conclusion/Interpretation

    • Fail to reject H_0. There is not sufficient evidence to suggest that the variances are different (i.e., the variance assumption is not broken).
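  • As a quick sketch, the p-value above is the upper tail of an F distribution with df_Trt = k - 1 = 3 and df_E = n - k = 16:
pf(0.734, df1 = 3, df2 = 16, lower.tail = FALSE)  # approximately 0.547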

Introduction: Kruskal-Wallis

  • We just discussed the ANOVA assumptions.

\varepsilon_{ij} \overset{\text{iid}}{\sim} N(0, \sigma^2)

  • We also discussed how to assess the assumptions:

    • Graphically using the anova_check() function.

    • Formally testing the variance assumption using the BFL test.

  • If we break an assumption, we will turn to the nonparametric alternative, the Kruskal-Wallis.

Kruskal-Wallis Test

  • If we break ANOVA assumptions, we should implement the nonparametric version, the Kruskal-Wallis.

  • The Kruskal-Wallis test determines if k independent samples come from populations with the same distribution.

  • Our new hypotheses are

    • H_0: M_1 = ... = M_k, where M_i is the population median of group i
    • H_1: at least one M_i is different

Kruskal-Wallis Test

  • The test statistic is as follows:

\chi^2_0 = \frac{12}{n(n+1)} \sum_{i=1}^k \frac{R_i^2}{n_i} - 3(n+1),

  • where

    • R_i is the sum of the ranks for group i,
    • n_i is the sample size for group i,
    • n = \sum_{i=1}^k n_i = total sample size, and
    • k is the number of groups.
  • The test statistic \chi^2_0 (often denoted H) follows a \chi^2 distribution with k-1 degrees of freedom.
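  • A minimal sketch of this calculation for the dental data (without the tie correction that kruskal.test() applies, so the value may differ slightly when ties are present):
r <- rank(data$strength)              # ranks of all observations combined
R_i <- tapply(r, data$system, sum)    # rank sum for each group
n_i <- tapply(r, data$system, length) # sample size for each group
n <- sum(n_i); k <- length(n_i)
chi2_0 <- 12/(n*(n + 1)) * sum(R_i^2/n_i) - 3*(n + 1)
pchisq(chi2_0, df = k - 1, lower.tail = FALSE)  # p-value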

Kruskal-Wallis Test

  • Hypotheses

    • H_0: \ M_1 = ... = M_k
    • H_1: at least one M_i is different
  • Test Statistic

    • \chi^2_0 = \frac{12}{n(n+1)} \sum_{i=1}^k \frac{R_i^2}{n_i} - 3(n+1)
  • p-Value

    • p = P[\chi^2_{k-1} \ge \chi^2_0]
  • Rejection Region

    • Reject H_0 if p < \alpha

Kruskal-Wallis Test

  • We will use the kruskal.test() function to perform the Kruskal-Wallis test.
kruskal.test(continuous_variable ~ grouping_variable, 
             data = dataset_name)
  • Applying this to our dental dataset,
kruskal.test(strength ~ system, data = data)

    Kruskal-Wallis rank sum test

data:  strength by system
Kruskal-Wallis chi-squared = 12.515, df = 3, p-value = 0.005812

Example

  • Hypotheses

    • H_0: \ M_1 = M_2 = M_3 = M_4
    • H_1: at least one M_i is different
  • Test Statistic and p-Value

    • \chi_0^2 = 12.515
    • p = 0.006
  • Rejection Region

    • Reject H_0 if p < \alpha; \alpha=0.01.
  • Conclusion/Interpretation

    • Reject H_0. There is sufficient evidence to suggest that there is a difference in strength between the four systems.

Kruskal-Wallis: Posthoc Testing

  • We can also perform posthoc testing in the Kruskal-Wallis setting.

  • The setup is just like Tukey’s: we perform all pairwise comparisons while controlling the Type I error rate.

  • Instead of using |\bar{y}_i - \bar{y}_j|, we will use |\bar{R}_i - \bar{R}_j|, where \bar{R}_i is the average rank of group i.

  • The comparison we are making:

    • We declare M_i \ne M_j if |\bar{R}_i - \bar{R}_j| \ge KW, where KW = \frac{q_{\alpha}(k, \infty)}{\sqrt{2}} \sqrt{\frac{n(n+1)}{12} \left( \frac{1}{n_i} + \frac{1}{n_j} \right)} and q_{\alpha}(k, \infty) is the critical value from the Studentized range distribution.
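  • A minimal sketch of this critical value for the dental data (k = 4, n = 20, all n_i = 5); note that the kruskalmc() function below may use a slightly different approximation for its critical.dif:
alpha <- 0.01; k <- 4; n <- 20; n_i <- 5; n_j <- 5
KW <- qtukey(1 - alpha, nmeans = k, df = Inf)/sqrt(2) *
  sqrt(n*(n + 1)/12 * (1/n_i + 1/n_j))
KW   # compare each |Rbar_i - Rbar_j| to this value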

Kruskal-Wallis: Posthoc Testing

  • We will use the kruskalmc() function from the pgirmess package.
kruskalmc(continuous_variable ~ grouping_variable, 
          data = dataset_name)
  • In our example,
library(pgirmess) 
kruskalmc(strength ~ system, 
          alpha = 0.01,
          data = data)
Multiple comparison test after Kruskal-Wallis 
alpha: 0.01 
Comparisons
                 obs.dif critical.dif stat.signif
Ceramic-Cimara       0.2      11.7637       FALSE
Ceramic-Cojet        7.9      11.7637       FALSE
Ceramic-Silistor    10.3      11.7637       FALSE
Cimara-Cojet         8.1      11.7637       FALSE
Cimara-Silistor     10.5      11.7637       FALSE
Cojet-Silistor       2.4      11.7637       FALSE
  • Which pairs are significantly different?

Wrap Up

  • Today we have talked about assessing ANOVA assumptions and performing the nonparametric alternative, the Kruskal-Wallis.

  • Per usual, we should only look at posthoc testing when we’ve detected an overall difference with the Kruskal-Wallis.

  • Next lecture: two-way ANOVA.