t-Test Assumptions

STA4173: Biostatistics
Spring 2025

Introduction: Assumptions

We have now learned one- and two-sample t-tests.
Recall, when we have two samples, they can be independent samples or dependent samples.
- Independent samples: two-sample t-test
- Dependent samples: paired t-test (one-sample t-test on difference)
Today we will discuss how to assess the assumptions on t-tests.

Normality Assumption: Set Up

All t-tests assume approximate normality of the data.
- In the case of one-sample t-tests, the measure of interest must somewhat follow a normal distribution.
- In the case of two-sample t-tests, the measure of interest in each group must somewhat follow a normal distribution.
Note that a paired t-test is technically a one-sample t-test, so we will examine normality of the difference.

Normality Assumption: Set Up

There are formal tests for normality (see article here), however, we will not use them.
- Tests for normality are not well-endorsed by statisticians.
Instead, we will assess normality using a quantile-quantile (q-q) plot.
- This is a scatterplot that will form a 45° line if the assumed distribution is correct.
- Here is more information about q-q plots.
We will create q-q plots for:
- The measurements in the case of the one-sample t-test.
- The measurements from each group in the case of the two-sample t-test.
- The difference between the groups in the case of the paired t-test.

Normality Assumption: R Syntax

We will assess the normality assumption graphically using a q-q plot
A package was written by a former student, classpackage.
- If you are working on the server, the package is already installed.
- If you are not working on the server, please ask me for the code needed to install.

Normality Assumption: Independent Data - R Syntax

Once installed, we call the package,

library(classpackage)

While there are several functions in this package, we are currently interested in the independent_qq_plot() function.

dataset_name %>% independent_qq_plot(variable = "continuous variable",
                                     grouping_variable = "grouping variable")

This will provide the the q-q plot for the two-sample t-test (i.e., for independent data).

Normality Assumption: Independent Data - Example

Recall the penguin example for the two-sample t-test.
- Is the body mass different for males and females?

penguins <- palmerpenguins::penguins
head(penguins, n=3)

Requesting the q-q plot,

penguins %>% independent_qq_plot(variable = "body_mass_g",
                                 grouping_variable = "sex")

Normality Assumption: Independent Data - Example

Normality Assumption: Dependent Data - R Syntax

While there are several functions in the classpackage package, we are now interested in the dependent_qq_plot() function.

wide_data %>% dependent_qq_plot(variable = "Display Name of Continuous Variable",
                                grouping_variable = " ", # do not edit this line
                                first_group = "first_variable", # first column for comparison
                                second_group = "second_variable") # second column for comparison

This will provide the the q-q plot for the paired t-test (i.e., for dependent data).

Normality Assumption: Repair Estimates

Recall the repair estimate example for the dependent t-test.

garage <- tibble(g1 = c(17.6, 20.2, 19.5, 11.3, 13.0, 
                        16.3, 15.3, 16.2, 12.2, 14.8,
                        21.3, 22.1, 16.9, 17.6, 18.4), 
                 g2 = c(17.3, 19.1, 18.4, 11.5, 12.7, 
                        15.8, 14.9, 15.3, 12.0, 14.2, 
                        21.0, 21.0, 16.1, 16.7, 17.5))

Requesting the q-q plot,

garage %>% dependent_qq_plot(variable = "estimate",
                             grouping_variable = "garage",
                             first_group = "g1",
                             second_group = "g2")

Normality Assumption: Repair Estimates

Wrap Up

Important note!!
- I do not expect you to agree with my assessment of q-q plots!
- What I do expect is that you know what to do after making your assessment.
Next up:
- What happens if we do not meet the assumption for a t-test….?