t-Test Assumptions

Introduction: Topics

  • Assumptions on t-tests:
    • One-sample
      • Normality
    • Independent samples
      • Normality
      • Equal variance
    • Dependent samples
      • Normality

Introduction: Normality Assumption

  • All t-tests assume approximate normality of the data.

    • In the case of one-sample t-tests, the measure of interest must approximately follow a normal distribution.

    • In the case of two-sample t-tests, the measure of interest in each group must approximately follow a normal distribution.

  • Note that a paired t-test is technically a one-sample t-test on the differences, so we will examine normality of the differences.

Normality Assumption: Quantile-Quantile Plots

  • There are formal tests for normality; however, we will not use them.

    • Tests for normality are not well-endorsed by statisticians: they tend to flag trivial departures from normality in large samples and to miss real departures in small samples.
  • Instead, we will assess normality using a quantile-quantile (Q-Q) plot.

  • A Q-Q plot helps us visually check if our data follows a specific distribution (here, the normal).

    • It compares the quantiles of our sample data to the quantiles of a theoretical distribution (the normal).
  • How do we read Q-Q plots?

    • Each dot represents one observation in our dataset.
    • If the data follow a normal distribution, we will observe that the dots fall roughly along a straight diagonal line.
    • We focus on the “middle” of the graph.
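  • For intuition only, here is a minimal base-R sketch of a Q-Q plot (this is not the ssstats helper, and the simulated vector x is purely illustrative):
# Simulate a small sample so the sketch is self-contained
set.seed(1)
x <- rnorm(25, mean = 50, sd = 5)

qqnorm(x)   # sample quantiles vs. theoretical normal quantiles
qqline(x)   # reference line; points near the line suggest approximate normality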

Normality Assumption: Independent Means

  • Recall our example from last lecture:
    • In the skies above Cloudsdale, Pegasus trainers believe that an average healthy Pegasus flaps its wings 50 flaps per minute when cruising. To see if today’s young Pegasi conform to that standard, a researcher samples 25 Pegasi at the Cloudsdale Training Grounds and measures each pony’s wing‐flap rate (in flaps/minute).

Normality Assumption: Independent Means

  • Further, we performed a two-sample t-test to determine if the above-target Pegasi eat more than 5 additional apples, on average, compared to the below-target Pegasi (\alpha=0.05).
wing_flap %>% independent_mean_HT(grouping = target,
                                  continuous = apples, 
                                  mu = 5, 
                                  alternative = "greater", 
                                  alpha = 0.05)
Two-sample t-test for two independent means and equal variance:
Null: H₀: μ₁ − μ₂ ≤ 5
Alternative: H₁: μ₁ − μ₂ > 5
Test statistic: t(23) = 5.445
p-value: p < 0.001
Conclusion: Reject the null hypothesis (p = < 0.001 < α = 0.05)
  • Are our results valid?
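  • As a hedged cross-check (not the course workflow), base R’s t.test() can run the same pooled test, assuming wing_flap contains the columns apples and target and that the above-target group is the first factor level:
# Pooled two-sample t-test of H1: mu1 - mu2 > 5 (difference is level 1 minus level 2)
t.test(apples ~ target, data = wing_flap,
       mu = 5, alternative = "greater", var.equal = TRUE)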

Normality Assumption: Independent Means (R)

  • We will use the independent_qq() function from library(ssstats) to assess normality.
dataset_name %>% independent_qq(continuous = continuous_variable,
                                grouping = grouping_variable)
  • This will provide the Q-Q plots and the histograms for the two independent groups under consideration.

Normality Assumption: Independent Means

  • Let’s now look at the normality assumption for our example.

  • How should we change the code for our dataset?

dataset_name %>% independent_qq(continuous = continuous_variable,
                                grouping = grouping_variable)

Normality Assumption: Independent Means

  • Let’s now look at the normality assumption for our example.

  • How should we change the code for our dataset?

wing_flap %>% independent_qq(continuous = apples,
                             grouping = target)

Normality Assumption: Independent Means

  • Running the code,
wing_flap %>% independent_qq(continuous = apples,
                             grouping = target)

Introduction: Variance Assumption

  • In addition to normality, the two-sample t-test assumes equal variance between groups.

    • Homoskedastic: same variance
    • Heteroskedastic: different variances
  • We can check this assumption and easily adjust if the assumption is broken.

  • Graphical method: scatterplot of residuals

  • Formal method: test for equal variances (Brown-Forsythe-Levene)

Variance Assumption: Residual Plot (R)

  • We will use the plot_residuals() function from library(ssstats) to graphically assess the assumption of equal variance.
dataset_name %>% plot_residuals(continuous = continuous_variable,
                                grouping = grouping_variable)
  • We should evaluate and compare the vertical spread (the lengths of the resulting “lines” of points) for the two groups; similar spread supports the equal-variance assumption.
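  • As a rough base-R sketch of the same idea (not the ssstats function), we can plot each observation’s deviation from its own group mean and compare the spread of the two groups; the column names below assume the wing_flap example:
# Residuals: each observation minus its group mean
grp_mean <- ave(wing_flap$apples, wing_flap$target)   # group means, repeated per row
res      <- wing_flap$apples - grp_mean
stripchart(res ~ wing_flap$target, vertical = TRUE, method = "jitter",
           xlab = "Group", ylab = "Residual")
abline(h = 0, lty = 2)   # similar vertical spread in both groups suggests equal variances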

Variance Assumption: Residual Plot

  • Let’s now look at the variance assumption for our example.

  • How should we change the code for our dataset?

dataset_name %>% plot_residuals(continuous = continuous_variable,
                                grouping = grouping_variable)

Variance Assumption: Residual Plot

  • Let’s now look at the variance assumption for our example.

  • Our updated code:

wing_flap %>% plot_residuals(continuous = apples,
                             grouping = target)

Variance Assumption: Residual Plot

  • Running the code,
wing_flap %>% plot_residuals(continuous = apples,
                             grouping = target)

Hypothesis Testing: Two or More Variances

  • If we believe the assumption may be violated, we can test for equal variance using the Brown-Forsythe-Levene (BFL) test.

  • This test is valid for more than two groups (read: we will see it again!)

  • Hypotheses

    • H_0: \ \sigma^2_1 = \sigma^2_2 = ... = \sigma^2_k
    • H_1: at least one \sigma^2_i is different.

Hypothesis Testing: Two or More Variances

  • Test Statistic:

F_0 = \frac{\sum_{i=1}^k n_i(\bar{z}_{i.}-\bar{z}_{..})^2/(k-1)}{\sum_{i=1}^k \sum_{j=1}^{n_i} (z_{ij}-\bar{z}_{i.})^2/(N-k)} \sim F_{\text{df}_{\text{num}}, \text{df}_{\text{den}}}

  • where
    • z_{ij} = |x_{ij} - \tilde{x}_i| is the absolute deviation of observation j in group i from that group’s median \tilde{x}_i
    • n_i is the sample size of group i
    • \bar{z}_{i.} is the mean of the z_{ij} in group i and \bar{z}_{..} is the grand mean of all z_{ij}
    • k is the number of groups
    • N = \sum_{i=1}^k n_i is the overall sample size
    • \text{df}_{\text{num}} = k-1, \text{df}_{\text{den}} = N-k.

Hypothesis Testing: Two or More Variances

  • Note that the BFL is a one-tailed (upper-tailed) test, which is different from when we are testing means using the t distribution.

  • p-value:

p = P\left[F_{\text{df}_{\text{num}}, \text{df}_{\text{den}}} \ge F_0\right]
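  • To make the formula concrete, here is a hand computation of F_0 and its p-value in R; this sketch assumes x is the continuous variable and g the grouping variable (e.g., x = apples and g = target in our example):
# Brown-Forsythe-Levene by hand: z_ij = |x_ij - group median|
bfl_by_hand <- function(x, g) {
  g <- factor(g)
  z <- abs(x - ave(x, g, FUN = median))            # absolute deviations from group medians
  k <- nlevels(g); N <- length(x)
  n_i    <- tapply(z, g, length)                   # group sizes
  zbar_i <- tapply(z, g, mean)                     # group means of z
  zbar   <- mean(z)                                # grand mean of z
  num <- sum(n_i * (zbar_i - zbar)^2) / (k - 1)
  den <- sum((z - ave(z, g, FUN = mean))^2) / (N - k)
  F0  <- num / den
  p   <- pf(F0, df1 = k - 1, df2 = N - k, lower.tail = FALSE)   # upper tail only
  c(F = F0, df_num = k - 1, df_den = N - k, p = p)
}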

Hypothesis Testing: Two or More Variances (R)

  • We will use the variances_HT() function from library(ssstats).
dataset_name %>% variances_HT(continuous = continuous_variable,
                              grouping = grouping_variable)

Hypothesis Testing: Two or More Variances

  • Let’s now test the variance assumption for our example.

  • How should we change the code for our dataset?

dataset_name %>% variances_HT(continuous = continuous_variable,
                              grouping = grouping_variable)

Hypothesis Testing: Two or More Variances

  • Let’s now test the variance assumption for our example.

  • Our updated code is:

wing_flap %>% variances_HT(continuous = apples,
                           grouping = target)

Hypothesis Testing: Two or More Variances

  • Running the code,
wing_flap %>% variances_HT(continuous = apples,
                           grouping = target)
Brown-Forsythe-Levene test for equality of variances:
Null: σ²_Above = σ²_Below 
Alternative: At least one variance is different 
Test statistic: F(1,23) = 0.063 
p-value: p = 0.804
Conclusion: Fail to reject the null hypothesis (p = 0.8045 ≥ α = 0.05)

Hypothesis Testing: Two or More Variances

  • Hypotheses:
    • H_0: \ \sigma^2_{\text{Above}} = \sigma^2_{\text{Below}}
    • H_1: \ \sigma^2_{\text{Above}} \ne \sigma^2_{\text{Below}}
  • Test Statistic and p-Value
    • F_0 = 0.063, p = 0.805
  • Rejection Region
    • Reject H_0 if p < \alpha; \alpha = 0.10
  • Conclusion and interpretation
    • Fail to reject H_0 (p \text{ vs } \alpha \to 0.805 > 0.10). There is not sufficient evidence to suggest that the variances of those above and below the target are different. That is, the variance assumption holds.
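  • If the car package is available, its leveneTest() function with median centering computes the same Brown-Forsythe version of Levene’s test and makes a useful cross-check (an alternative to the ssstats function, not a course requirement):
library(car)   # assumes the car package is installed
leveneTest(apples ~ target, data = wing_flap, center = median)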

Variance Assumption: Broken – Now What?

  • What do we do if we have actually broken the variance assumption?

  • If the normality assumption holds, we can use Satterthwaite’s approximation for degrees of freedom.

    • Everything about the two-sample t-test is the same, other than the calculation of the degrees of freedom.

\text{df}=\frac{ \left( \frac{s^2_1}{n_1} + \frac{s_2^2}{n_2} \right)^2 }{ \frac{(s_1^2/n_1)^2}{n_1-1} + \frac{(s_2^2/n_2)^2}{n_2-1} }

  • Compare this with the pooled test’s df of n_1+n_2-2; Satterthwaite’s df always falls between \min(n_1-1, n_2-1) and n_1+n_2-2.
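  • A small R sketch of the degrees-of-freedom formula, using hypothetical summary statistics (the values below are not from our data):
# Satterthwaite (Welch) degrees of freedom from summary statistics
satterthwaite_df <- function(s1, n1, s2, n2) {
  v1 <- s1^2 / n1
  v2 <- s2^2 / n2
  (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
}
satterthwaite_df(s1 = 2.0, n1 = 15, s2 = 3.5, n2 = 10)   # hypothetical inputs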

Confidence Interval: Two Independent Means with Unequal Variances

  • We will still use the independent_mean_CI() function from library(ssstats) to find the confidence interval…

  • Generic syntax:

dataset_name %>% independent_mean_CI(grouping = grouping_variable,
                                     continuous = continuous_variable, 
                                     confidence = confidence_level,
                                     variance = "unequal")
  • Note the variance argument at the end.

Hypothesis Testing: Two Independent Means with Unequal Variances

  • We will use the independent_mean_HT() function from library(ssstats) to perform the necessary calculations for the hypothesis test.

  • Generic syntax:

dataset_name %>% independent_mean_HT(grouping = grouping_variable,
                                     continuous = continuous_variable, 
                                     mu = hypothesized_value, 
                                     alternative = "alternative_direction", 
                                     alpha = specified_alpha,
                                     variance = "unequal")
  • Note the variance argument at the end.
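  • As a hedged point of comparison, base R’s t.test() with var.equal = FALSE performs the same Welch (Satterthwaite) version of the test and reports both the test and the confidence interval; the column names again assume our wing_flap example:
welch_fit <- t.test(apples ~ target, data = wing_flap,
                    mu = 0, alternative = "two.sided",
                    var.equal = FALSE, conf.level = 0.90)
welch_fit            # t statistic, Satterthwaite df, and p-value
welch_fit$conf.int   # 90% confidence interval for mu1 - mu2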

Comparisons: Two Independent Means

  • Let’s look at the 90% CI for our example under both ways of calculating degrees of freedom:

  • Assuming equal variance:

wing_flap %>% independent_mean_CI(continuous = apples,
                                  grouping = target,
                                  confidence = 0.90,
                                  variance = "equal")
  • Assuming unequal variance:
wing_flap %>% independent_mean_CI(continuous = apples,
                                  grouping = target,
                                  confidence = 0.90,
                                  variance = "unequal")

Comparisons: Two Independent Means

  • Let’s look at the 90% CI for our example under both ways of calculating degrees of freedom:

  • Assuming equal variance:

The point estimate for the difference in means is x̄₁ − x̄₂ = 10.0556
The 90% confidence interval for μ₁ − μ₂ is (8.4642, 11.647)
  • Assuming unequal variance:
The point estimate for the difference in means is x̄₁ − x̄₂ = 10.0556
The 90% confidence interval for μ₁ − μ₂ is (8.3697, 11.7415)

Comparisons: Two Independent Means

  • Let’s look at the hypothesis test results for our example under both ways of calculating degrees of freedom (\alpha = 0.10):

  • Assuming equal variance,

wing_flap %>% independent_mean_HT(continuous = apples,
                                  grouping = target,
                                  alpha = 0.10,
                                  variance = "equal")
  • Assuming unequal variance,
wing_flap %>% independent_mean_HT(continuous = apples,
                                  grouping = target,
                                  alpha = 0.10,
                                  variance = "unequal")

Comparisons: Two Independent Means

  • Assuming equal variance,
Two-sample t-test for two independent means and equal variance:
Null: H₀: μ₁ − μ₂ = 0
Alternative: H₁: μ₁ − μ₂ ≠ 0
Test statistic: t(23) = 10.829
p-value: p < 0.001
Conclusion: Reject the null hypothesis (p = < 0.001 < α = 0.1)
  • Assuming unequal variance,
Two-sample t-test for two independent means and unequal variance:
Null: H₀: μ₁ − μ₂ = 0
Alternative: H₁: μ₁ − μ₂ ≠ 0
Test statistic: t(15.06) = 10.454
p-value: p < 0.001
Conclusion: Reject the null hypothesis (p = < 0.001 < α = 0.1)

Normality Assumption: Dependent Means

  • Let’s now look at when we are examining dependent means.
    • Must be able to link observations using an identifier.
  • Recall that we are actually examining the difference between the two groups, d = x_1-x_2.
    • We want to know about the normality of d and not x_1 and x_2.
    • There is no variance assumption with this test.
  • As a reminder, how do we read Q-Q plots?
    • Each dot represents one observation (difference) in our dataset.
    • If the data follow a normal distribution, we will observe that the dots fall roughly along a straight diagonal line.
    • We focus on the “middle” of the graph.
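  • A minimal base-R sketch of this check (not the ssstats helper), assuming the paired columns from our example:
# Differences d = post - pre, then a Q-Q plot and histogram of d
d <- wing_flap$post_training_wfr - wing_flap$pre_training_wfr
qqnorm(d); qqline(d)   # points near the line suggest d is roughly normal
hist(d)                # companion histogram of the differences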

Normality Assumption: Dependent Means

  • Recall our example from last lecture:
    • Princess Celestia has invited two groups of flyers to take part in a brand-new “SkyStride” aerial training camp. Before the camp begins, each pony perches on a floating platform while a team of Wonderbolt engineers uses magical sensors to record their baseline wing-flap rate (flaps per second) as they hover in place (pre_training_wfr).
    • Over two weeks, trainees attend identical flight drills: precision loops, cloud-weaving obstacle courses, and high-altitude sprints. At camp’s end, each flyer returns to the sensor platforms for post-training measurements (post_training_wfr).

Normality Assumption: Dependent Means

  • Further, we performed a dependent t-test to determine if there is a difference in wing-flap rate pre- and post-training (\alpha=0.01).
Paired t-test for the mean of differences:
Null: H₀: μ_d = 0
Alternative: H₁: μ_d ≠ 0
Test statistic: t(24) = -0.859
p-value: p = 0.399
Conclusion: Fail to reject the null hypothesis (p = 0.3991 ≥ α = 0.01)
  • Are our results valid?
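  • As a hedged cross-check (not the course workflow), base R’s paired t-test should agree with the output above, assuming the same pre/post columns:
t.test(wing_flap$post_training_wfr, wing_flap$pre_training_wfr,
       paired = TRUE, alternative = "two.sided", conf.level = 0.99)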

Normality Assumption: Dependent Means (R)

  • We will use the dependent_qq() function from library(ssstats) to assess normality.
dataset_name %>% dependent_qq(col1 = first_group,
                              col2 = second_group)
  • This will provide the Q-Q plot and the histogram for the difference between the two groups.

Normality Assumption: Dependent Means

  • Let’s now look at the normality assumption for our example.

  • How should we change the code for our dataset?

dataset_name %>% dependent_qq(col1 = first_group,
                              col2 = second_group)

Normality Assumption: Dependent Means

  • Let’s now look at the normality assumption for our example.

  • Our updated code,

wing_flap %>% dependent_qq(col1 = post_training_wfr,
                           col2 = pre_training_wfr)

Normality Assumption: Dependent Means

  • Running the code,
wing_flap %>% dependent_qq(col1 = post_training_wfr,
                           col2 = pre_training_wfr)

Wrap Up: t-Test Assumptions

  • Important note!!
    • I do not expect you to agree with my assessment of Q-Q plots!
    • What I do expect is that you know what to do after making your assessment.
  • Two independent means:
    • Normality and variance met \to pooled t-test.
    • Normality met but variance not met \to Satterthwaite’s approximation for df.

Wrap Up

  • Today’s lecture:
    • Normality assumption (all t) \to Q-Q plot
    • Variance assumption (independent t) \to Brown-Forsythe-Levene (BFL) test
  • Next week:
    • M/T: lecture on two-sample medians
      • Wilcoxon rank sum (nonparametric equivalent to independent t)
      • Wilcoxon signed rank (nonparametric equivalent to dependent t)
    • W/R: R lab for Wilcoxons