STA4173: Biostatistics
Spring 2025
We have previously discussed testing the difference between two groups.
To compare the means of three or more groups, we will use a method called analysis of variance (ANOVA).
Fun fact: the two-sample t-test is a special case of ANOVA.
Hypotheses all take the same form:
H_0: \mu_1 = \mu_2 = \cdots = \mu_k \text{ vs. } H_1: \text{at least one } \mu_i \text{ differs}
Note 1: you must fill in the “k” when writing hypotheses!
e.g., if there are four means, your hypotheses are
H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 \text{ vs. } H_1: \text{at least one } \mu_i \text{ differs}
Note 2: ANOVA does not tell us which means are different, just if a general difference exists!
The computations for ANOVA are more involved than what we’ve seen before.
An ANOVA table will be constructed in order to perform the hypothesis test.
Source | Sum of Squares | df | Mean Squares | F |
---|---|---|---|---|
Treatment | SSTrt | dfTrt | MSTrt | F0 |
Error | SSE | dfE | MSE | |
Total | SSTot | dfTot | | |
Once this is put together, we can perform the hypothesis test.
The F distribution is derived as the ratio of two variances.
The F distribution’s shape depends on its numerator and denominator degrees of freedom.
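For reference, F critical values can be pulled in R with qf(); a quick sketch, using the degrees of freedom that appear in the example later in these notes:

```r
# upper-tail critical value of F with 3 and 16 degrees of freedom (alpha = 0.05)
qf(0.95, df1 = 3, df2 = 16)  # approximately 3.24
```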
We are partitioning the variance of our outcome into:
Variance due to the grouping (treatment)
Variance due to “other” factors (error)
To compute the sums of squares, we need the overall mean, \bar{x}, and, for each group i, the sample size n_i, the group mean \bar{x}_i, and the group variance s_i^2.
\begin{align*} \text{SS}_{\text{Trt}} &= \sum_{i=1}^k n_i(\bar{x}_i-\bar{x})^2 \\ \text{SS}_{\text{E}} &= \sum_{i=1}^k (n_i-1)s_i^2 \\ \text{SS}_{\text{Tot}} &= \text{SS}_{\text{Trt}} + \text{SS}_{\text{E}} \end{align*}
Once we have the sums of squares and their corresponding degrees of freedom, \text{df}_{\text{Trt}} = k-1, \text{df}_{\text{E}} = n-k, and \text{df}_{\text{Tot}} = n-1 (where n is the total sample size), we can find the mean squares.
Generally, mean squares are the sums of squares divided by the df, \text{MS}_X = \frac{\text{SS}_X}{\text{df}_X}
In the case of one-way ANOVA, \begin{align*} \text{MS}_{\text{Trt}} &= \frac{\text{SS}_{\text{Trt}}}{\text{df}_{\text{Trt}}} \\ \text{MS}_{\text{E}} &= \frac{\text{SS}_{\text{E}}}{\text{df}_{\text{E}}} \end{align*}
Finally, we have the test statistic.
Generally, we construct an F for ANOVA by dividing the MS of interest by MS_{\text{E}}, F_X = \frac{\text{MS}_X}{\text{MS}_{\text{E}}}
In one-way ANOVA, we are only constructing the F for treatment, F_0 = \frac{\text{MS}_{\text{Trt}}}{\text{MS}_{\text{E}}}
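To make the bookkeeping concrete, here is a minimal R sketch that builds all of the ANOVA table quantities from group summaries; the data values are hypothetical, purely for illustration:

```r
# hypothetical data: three groups of three observations each
groups <- split(c(10, 12, 11, 14, 15, 13, 9, 8, 10), rep(1:3, each = 3))

n_i    <- sapply(groups, length)         # group sizes
xbar_i <- sapply(groups, mean)           # group means
s2_i   <- sapply(groups, var)            # group variances
xbar   <- mean(unlist(groups))           # overall mean
k      <- length(groups)                 # number of groups
n      <- sum(n_i)                       # total sample size

SS_trt <- sum(n_i * (xbar_i - xbar)^2)   # treatment sum of squares
SS_e   <- sum((n_i - 1) * s2_i)          # error sum of squares
SS_tot <- SS_trt + SS_e                  # total sum of squares

MS_trt <- SS_trt / (k - 1)               # mean square for treatment
MS_e   <- SS_e / (n - k)                 # mean square error
F0     <- MS_trt / MS_e                  # test statistic
```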
To produce the ANOVA table in R, we can use the aov() and summary() functions, or, alternatively, lm() to define the model and anova() to construct the table; a short sketch of both appears in the example below.

Prosthodontists specialize in the restoration of oral function, including the use of dental implants, veneers, dentures, and crowns. A researcher wanted to compare the shear bond strength of different repair kits for repairs of chipped porcelain veneer.
He randomly divided 20 porcelain specimens into four treatment groups: group 1 used the Cojet system, group 2 used the Silistor system, group 3 used the Cimara system, and group 4 used the Ceramic Repair system.
At the conclusion of the study, shear bond strength (in megapascals, MPa) was measured according to ISO 10477. The data are as follows:
What is the continuous variable?
What is the grouping variable?
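A minimal sketch, assuming the data are stored in a data frame called data with columns strength (the continuous outcome, in MPa) and system (the repair kit), matching the names used in the code later in these notes:

```r
# fit the one-way ANOVA model and print the ANOVA table
m <- aov(strength ~ system, data = data)
summary(m)

# equivalent approach: define the model with lm(), then build the table
anova(lm(strength ~ system, data = data))
```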
```
            Df Sum Sq Mean Sq F value  Pr(>F)   
system       3  200.0   66.66   7.545 0.00229 **
Residuals   16  141.4    8.84                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```
Hypotheses: H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 \text{ vs. } H_1: \text{at least one } \mu_i \text{ differs}
Test Statistic: F_0 = 7.545
p-Value: p = 0.00229
Rejection Region: reject H_0 if F_0 > F_{0.05}(3, 16) \approx 3.24; here, 7.545 > 3.24
Conclusion/Interpretation: since p = 0.00229 < \alpha = 0.05, we reject H_0; there is sufficient evidence that at least one repair system’s mean shear bond strength differs from the others.
Today we have introduced ANOVA. Recall the hypotheses,
H_0: \mu_1 = \mu_2 = \cdots = \mu_k \text{ vs. } H_1: \text{at least one } \mu_i \text{ differs}
The F test does not tell us which mean is different… only that a difference exists.
In theory, we could perform repeated t tests to determine pairwise differences.
Recall that ANOVA is an extension of the t test… or that the t test is a special case of ANOVA.
However, this will increase the Type I error rate (\alpha).
Recall that the Type I error rate, \alpha, is the probability of incorrectly rejecting H_0.
Suppose we are comparing 5 groups.
This is \binom{5}{2} = 10 pairwise comparisons!!
If we perform repeated t tests, each under \alpha=0.05, we inflate the overall Type I error rate to 1-(1-0.05)^{10} \approx 0.40! 😵
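A quick sanity check of that number in R:

```r
# familywise Type I error rate for 10 tests, each at alpha = 0.05,
# treating the tests as independent
1 - (1 - 0.05)^choose(5, 2)  # 0.4012631
```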
When performing posthoc comparisons, we can choose one of two paths:
control the (familywise) Type I error rate, or
do not control it.
Note that controlling the Type I error rate gives a more conservative test than not controlling it.
Generally, statisticians:
do not control the Type I error rate if examining the results of pilot/preliminary studies that are exploring for general relationships.
do control the Type I error rate if examining the results of confirmatory studies and are attempting to confirm relationships observed in pilot/preliminary studies.
The posthoc tests we will learn:
Tukey’s test
Fisher’s least significant difference
Dunnett’s test
Caution: we should only perform posthoc tests if we have determined that a general difference exists!
Tukey’s test allows us to do all pairwise comparisons while controlling \alpha.
The underlying idea of the comparison:
We declare \mu_i \ne \mu_j if |\bar{y}_i - \bar{y}_j| \ge W, where W = \frac{q_{\alpha}(k, \text{df}_{\text{E}})}{\sqrt{2}} \sqrt{\text{MSE} \left( \frac{1}{n_i} + \frac{1}{n_j} \right)}
We will use the TukeyHSD() function, passing it the fitted model object from the aov() function.
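A minimal sketch, reusing the fitted aov() model m from earlier:

```r
# all pairwise comparisons, controlling the familywise alpha at 0.05
TukeyHSD(m, conf.level = 0.95)
```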
```
                  diff         lwr       upr       p adj
Cimara-Ceramic   -0.14 -7.04151507  6.761515 0.999845202
Cojet-Ceramic     5.50 -1.40151507 12.401515 0.044147158
Silistor-Ceramic  6.86 -0.04151507 13.761515 0.010458208
Cojet-Cimara      5.64 -1.26151507 12.541515 0.038206781
Silistor-Cimara   7.00  0.09848493 13.901515 0.008990873
Silistor-Cojet    1.36 -5.54151507  8.261515 0.886304336
```
At \alpha = 0.05, Tukey’s test flags Cojet-Ceramic, Silistor-Ceramic, Cojet-Cimara, and Silistor-Cimara as significantly different pairs.
Fisher’s least significant difference (LSD) allows us to test all pairwise comparisons, but it does not control the familywise \alpha.
The underlying idea of the comparison:
We declare \mu_i \ne \mu_j if |\bar{y}_i - \bar{y}_j| \ge \text{LSD}, where \text{LSD} = t_{\alpha/2, \text{df}_{\text{E}}} \sqrt{\text{MSE} \left( \frac{1}{n_i} + \frac{1}{n_j} \right)}
We will use the LSD.test() function from the agricolae package, pulling the error df and MSE from the summary of the aov() model.

```r
library(agricolae)
results <- summary(m)                           # summary of the aov() model
(LSD.test(dataset_name$continuous_variable,     # continuous outcome
          dataset_name$grouping_variable,       # grouping variable
          results[[1]]$Df[2],                   # df_E
          results[[1]]$`Mean Sq`[2],            # MSE
          alpha = alpha_level)                  # can omit if alpha = 0.05
)[5]                                            # limit to only the pairwise comparison results
```
```r
library(agricolae)
results <- summary(m)
LSD.test(data$strength,
         data$system,
         results[[1]]$Df[2],
         results[[1]]$`Mean Sq`[2],
         alpha = 0.01)[5]
```
```
$groups
         data$strength groups
Silistor         17.64      a
Cojet            16.28      a
Ceramic          10.78      b
Cimara           10.64      b
```
Systems that share a letter are not significantly different: Silistor and Cojet form one group, while Ceramic and Cimara form another.
Dunnett’s test allows us to do all pairwise comparisons against only the control, while controlling \alpha.
This has fewer comparisons than Tukey’s because we are not comparing non-control groups to one another.
i.e., we are sharing the \alpha between fewer comparisons now, which is preferred if we are not interested in the comparisons between non-control groups.
The underlying idea of the comparison:
We declare \mu_i \ne \mu_c if |\bar{y}_i - \bar{y}_c| \ge D, where D = d_{\alpha}(k-1, \text{df}_{\text{E}}) \sqrt{\text{MSE} \left( \frac{1}{n_i} + \frac{1}{n_c} \right)}, with n_c the sample size of the control group and d_{\alpha} the critical value from Dunnett’s table.
We will use the DunnettTest() function from the DescTools package to perform Dunnett’s test. Let’s apply Dunnett’s to the dental data.
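A minimal sketch, assuming the same data frame as before and treating "Ceramic" as the control group (consistent with the output below):

```r
library(DescTools)
# compare each repair system against the Ceramic control
DunnettTest(strength ~ system, data = data, control = "Ceramic")
```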
```
  Dunnett's test for comparing several treatments with a control :  
    95% family-wise confidence level

$Ceramic
                  diff    lwr.ci    upr.ci   pval    
Cimara-Ceramic   -0.14 -5.0138317  4.733832 0.9997    
Cojet-Ceramic     5.50  0.6261683 10.373832 0.0258 *  
Silistor-Ceramic  6.86  1.9861683 11.733832 0.0058 ** 

---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```