Two-Sample Means

Introduction: Topics

We have previously reviewed estimation and basic statistical inference:
- Point estimation
- Confidence intervals
- Hypothesis testing
Today we will focus on the following scenarios:
- Two-sample mean, independent data (\mu_1-\mu_2)
- Two-sample mean, dependent data (\mu_d)

Introduction: Data

We are using data from the kingdom of Equestria from My Little Pony.
Mane Six:
- Twilight Sparkle (Unicorn \to Alicorn)
- Applejack (Earth Pony)
- Fluttershy (Pegasus)
- Pinkie Pie (Earth Pony)
- Rainbow Dash (Pegasus)
- Rarity (Unicorn)

Independent Data

Independent data: Observations in one group (or sample) do not influence or relate to observations in another group.
- Examples:
  - Comparing the cruising speeds of a random sample of Pegasi vs. a random sample of Unicorns flying a short course.
  - Measuring friendship lesson quiz scores for a group of Cutie Mark Crusaders vs. a group of Wonderbolts Cadets.
  - Examining the graduation rates between Unicorns and Pegasi.

Dependent Data

Dependent (paired) data: Each observation in the first sample is paired with exactly one observation in the second sample.
- Examples:
  - Students’ magic‐proficiency scores before and after Princess Celestia’s advanced spell workshop.
  - Applejack’s apple‐yield (in bushels) from Sweet Apple Acres in Spring vs. Fall for the last 10 years.
  - Comparing the “Wonderbolts Tryouts” performance scores for Spitfire and Skyflare (twins).

Independent vs. Dependent Data

Are the following dependent or independent?
1. Rainbow Dash times two separate groups, Pegasi trainees and Unicorn cadets, on the same 200-meter aerial course.
2. Twilight Sparkle measures her own spell‐casting accuracy before and after attending Princess Celestia’s advanced magic workshop.
3. Applejack records bushel counts from Sweet Apple Acres in spring this year and compares them to bushel counts from Sugarcube’s orchard over the same period.
4. The Cutie Mark Crusaders each take a friendship-lesson quiz, and their scores are compared to a completely different group of ponies at the School of Friendship.
5. Fluttershy records the heart rates of the same group of critters before and after she plays soothing music for them.

Confidence Intervals: Two Independent Means

(1-\alpha)100\% confidence interval for \mu_1-\mu_2:

(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \sqrt{\frac{s_1^2 }{n_1} + \frac{s_2^2}{n_2}}

where
- (\bar{x}_1-\bar{x}_2) is the point estimate for \mu_1-\mu_2
  - \bar{x}_i is the sample mean for group i
- t_{\alpha/2} has \min(n_1-1, n_2-1) degrees of freedom
- s_i^2 is the sample variance for group i
- n_i is the sample size of group i
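
As a rough sketch of how the pieces of this formula fit together, the interval can be computed by hand in base R. The summary statistics below are hypothetical (they are not taken from any dataset in this lecture), and the variable names are our own.

```r
# Hypothetical summary statistics for two independent groups (illustrative only)
xbar1 <- 30; s1 <- 4; n1 <- 12
xbar2 <- 25; s2 <- 5; n2 <- 15
confidence <- 0.95
alpha <- 1 - confidence

# Point estimate and standard error of the difference in means
point_estimate <- xbar1 - xbar2
std_error <- sqrt(s1^2 / n1 + s2^2 / n2)

# Conservative degrees of freedom, as defined on the slide above
df <- min(n1 - 1, n2 - 1)

# Critical value t_{alpha/2} and the resulting interval
t_crit <- qt(1 - alpha / 2, df)
ci <- point_estimate + c(-1, 1) * t_crit * std_error
ci
```

The interval is centered at the point estimate, and its half-width is the critical value times the standard error.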

Confidence Intervals: Two Independent Means (R)

We will use the independent_mean_CI function from library(ssstats) to find the confidence interval.
Generic syntax:

dataset_name %>% independent_mean_CI(grouping = grouping_variable,
                                     continuous = continuous_variable, 
                                     confidence = confidence_level)

Confidence Intervals: Two Independent Means

The Pegasus trainers insist that a healthy Pony munches through 25 apples per day to stay strong and energetic. Looking for differences between those that are above and below target wing-flap rates, a researcher visits the apple stands at Sweet Apple Acres and records the exact number of apples each of the Pegasi in training eats in a typical day.
Use the wing-flap data to estimate the difference in apple consumption (apples) between those that are above and below the target rate (target). Estimate using a 95% confidence interval.
How should we change the following code?

dataset_name %>% independent_mean_CI(grouping = grouping_variable,
                                     continuous = continuous_variable, 
                                     confidence = confidence_level)

Confidence Intervals: Two Independent Means

The Pegasus trainers insist that a healthy Pony munches through 25 apples per day to stay strong and energetic. Looking for differences between those that are above and below target wing-flap rates, a researcher visits the apple stands at Sweet Apple Acres and records the exact number of apples each of the Pegasi in training eats in a typical day.
Use the wing-flap data to estimate the difference in apple consumption (apples) between those that are above and below the target rate (target). Estimate using a 95% confidence interval.
Our updated code should look like:

wing_flap %>% independent_mean_CI(grouping = target,
                                  continuous = apples, 
                                  confidence = 0.95)

Confidence Intervals: Two Independent Means

Running the code,

wing_flap %>% independent_mean_CI(grouping = target,
                                  continuous = apples, 
                                  confidence = 0.95)

The point estimate for the difference in means is x̄₁ − x̄₂ = 10.0556
The 95% confidence interval for μ₁ − μ₂ is (8.1347, 11.9764)

Thus, the 95% CI for \mu_{\text{above}} - \mu_{\text{below}} is (8.13, 11.98).
- The Pegasi above the target wing-flap rate eat, on average, somewhere between 8 and 12 more apples per day than those below the target wing-flap rate.

Hypothesis Testing: Two Independent Means

Hypotheses: Two Tailed
- H_0: \ \mu_1-\mu_2=\mu_0
- H_1: \ \mu_1-\mu_2 \ne \mu_0
Hypotheses: Left Tailed
- H_0: \ \mu_1-\mu_2 \ge \mu_0
- H_1: \ \mu_1-\mu_2 < \mu_0
Hypotheses: Right Tailed
- H_0: \ \mu_1-\mu_2 \le \mu_0
- H_1: \ \mu_1-\mu_2 > \mu_0

Hypothesis Testing: Two Independent Means

Test Statistic:

t_0 = \frac{(\bar{x}_1-\bar{x}_2)-\mu_0}{{\sqrt{\frac{s_1^2}{n_1} + \frac{s^2_2}{n_2}}}}

where
- \bar{x}_i is the mean for group i
- \mu_0 is the hypothesized difference
- s_i^2 is the sample variance for group i
- n_i is the sample size of group i
- \text{df} = \text{min}(n_1-1, n_2-1)
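
A minimal sketch of this calculation in base R, using hypothetical summary statistics (not from the lecture data) and variable names of our own choosing:

```r
# Hypothetical summary statistics for two independent groups (illustrative only)
xbar1 <- 30; s1 <- 4; n1 <- 12
xbar2 <- 25; s2 <- 5; n2 <- 15
mu0 <- 0  # hypothesized difference under the null

# Test statistic: (observed difference - hypothesized difference) / standard error
t0 <- ((xbar1 - xbar2) - mu0) / sqrt(s1^2 / n1 + s2^2 / n2)

# Conservative degrees of freedom, as defined above
df <- min(n1 - 1, n2 - 1)

c(t0 = t0, df = df)
```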

Hypothesis Testing: Two Independent Means

p-Value:

p-value: Two Tailed

p = 2\times P\left[t_{\text{df}} \ge |t_0|\right]

p-value: Left Tailed

p = P\left[t_{\text{df}} \le t_0\right]

p-value: Right Tailed

p = P\left[t_{\text{df}} \ge t_0\right]
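
The three p-value definitions can be sketched with base R's pt function, which returns probabilities from the t distribution. The values of t0 and df below are hypothetical.

```r
# Hypothetical test statistic and degrees of freedom (illustrative only)
t0 <- 2.1
df <- 11

# Two tailed: twice the area beyond |t0|
p_two_tailed <- 2 * pt(abs(t0), df, lower.tail = FALSE)

# Left tailed: area at or below t0
p_left <- pt(t0, df)

# Right tailed: area at or above t0
p_right <- pt(t0, df, lower.tail = FALSE)

c(two_tailed = p_two_tailed, left = p_left, right = p_right)
```

Note that the left- and right-tailed p-values always sum to 1, so choosing the wrong direction can change the conclusion completely.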

Hypothesis Testing: Two Independent Means (R)

We will use the independent_mean_HT function from library(ssstats) to perform the necessary calculations for the hypothesis test.
Generic syntax:

dataset_name %>% independent_mean_HT(grouping = grouping_variable,
                                     continuous = continuous_variable, 
                                     mu = hypothesized_value, 
                                     alternative = "alternative_direction", 
                                     alpha = specified_alpha)

For the entered variable (continuous), we will see:
- Hypotheses (based on hypothesized_value and alternative)
- Test statistic and p-value
- Conclusion
Note! R orders the levels of the grouping variable alphabetically (or numerically) and subtracts in that order: first group minus second group.
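
As a quick illustration of that ordering, base R sorts the levels of a character grouping variable alphabetically by default. The labels below are hypothetical, and we are assuming the ssstats functions follow this same default ordering.

```r
# Hypothetical group labels (illustrative only)
target <- c("below", "above", "above", "below")

# Default factor levels are sorted alphabetically
group_order <- levels(factor(target))
group_order
```

Under that assumption, a grouping variable with these labels would give the difference "above" minus "below".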

Hypothesis Testing: Two Independent Means

Perform the appropriate hypothesis test to determine if the above-target Pegasi eat, on average, more than 5 more apples per day than the below-target Pegasi. Test at the \alpha=0.05 level.
What is the direction of the test? How do you know?
What is the hypothesized value? How do you know?
What are the corresponding hypotheses?

Hypothesis Testing: Two Independent Means

Perform the appropriate hypothesis test to determine if the above-target Pegasi eat, on average, more than 5 more apples per day than the below-target Pegasi. Test at the \alpha=0.05 level.
How should we change the following code?

dataset_name %>% independent_mean_HT(grouping = grouping_variable,
                                     continuous = continuous_variable, 
                                     mu = hypothesized_value, 
                                     alternative = "alternative_direction", 
                                     alpha = specified_alpha)

Hypothesis Testing: Two Independent Means

Perform the appropriate hypothesis test to determine if the above-target Pegasi eat, on average, more than 5 more apples per day than the below-target Pegasi. Test at the \alpha=0.05 level.
Our updated code should look like:

wing_flap %>% independent_mean_HT(grouping = target,
                                  continuous = apples, 
                                  mu = 5, 
                                  alternative = "greater", 
                                  alpha = 0.05)

Hypothesis Testing: Two Independent Means

Running the code,

wing_flap %>% independent_mean_HT(grouping = target,
                                  continuous = apples, 
                                  mu = 5, 
                                  alternative = "greater", 
                                  alpha = 0.05)

Two-sample t-test for two independent means and equal variance:
Null: H₀: μ₁ − μ₂ ≤ 5
Alternative: H₁: μ₁ − μ₂ > 5
Test statistic: t(23) = 5.445
p-value: p < 0.001
Conclusion: Reject the null hypothesis (p = < 0.001 < α = 0.05)

Hypothesis Testing: Two Independent Means

Hypotheses:
- H_0: \ \mu_{\text{above}} - \mu_{\text{below}} \le 5
- H_1: \ \mu_{\text{above}} - \mu_{\text{below}} > 5
Test Statistic and p-Value
- t_0 = 5.445, p < 0.001
Rejection Region
- Reject H_0 if p < \alpha; \alpha = 0.05
Conclusion and interpretation
- Reject H_0 (p \text{ vs } \alpha \to p < 0.001 < 0.05). There is sufficient evidence to suggest that Pegasi above target eat, on average, more than 5 more apples per day than those below target.

Two Dependent Means: Summary Statistics

We are now interested in comparing two dependent groups.
We assume that the two groups come from the same population and examine the difference within each pair i,

d_i = y_{i, 1} - y_{i, 2}

After drawing samples, we have the following,
- \bar{d} estimates \mu_d,
- s^2_d estimates \sigma^2_d, and
- n is the sample size.
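
A small sketch of these quantities in base R, using made-up paired measurements (the vectors y1 and y2 are hypothetical):

```r
# Hypothetical paired measurements: one row per subject (illustrative only)
y1 <- c(12, 15, 11, 14, 13)
y2 <- c(10, 14, 12, 11, 13)

# Pairwise differences, computed within each pair
d <- y1 - y2

d_bar <- mean(d)   # estimates mu_d
s2_d  <- var(d)    # estimates sigma^2_d
n     <- length(d) # number of pairs

c(d_bar = d_bar, s2_d = s2_d, n = n)
```

Everything that follows for dependent data is a one-sample analysis of the differences d.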

Two Dependent Means: Summary Statistics (R)

We will use the dependent_mean_median function from library(ssstats) to find the summary statistics for this data.
Generic syntax:

dataset_name %>% dependent_mean_median(col1 = first_variable,
                                       col2 = second_variable)

Note that this will compute summary statistics for:
- x_d = x_1-x_2
- x_1
- x_2

Two Dependent Means: Summary Statistics

Princess Celestia has invited two groups of flyers to take part in a brand-new “SkyStride” aerial training camp. Before the camp begins, each pony perches on a floating platform while a team of Wonderbolt engineers use magical sensors to record their baseline wing-flap rate (flaps per second) as they hover in place (pre_training_wfr).
Over two weeks, trainees attend identical flight drills: precision loops, cloud-weaving obstacle courses, and high-altitude sprints. At camp’s end, each flyer returns to the sensor platforms for post-training measurements (post_training_wfr).
Let’s find the summary statistics. How should this code be edited?

dataset_name %>% dependent_mean_median(col1 = first_variable,
                                       col2 = second_variable)

Two Dependent Means: Summary Statistics

Princess Celestia has invited two groups of flyers to take part in a brand-new “SkyStride” aerial training camp. Before the camp begins, each pony perches on a floating platform while a team of Wonderbolt engineers use magical sensors to record their baseline wing-flap rate (flaps per second) as they hover in place (pre_training_wfr).
Over two weeks, trainees attend identical flight drills: precision loops, cloud-weaving obstacle courses, and high-altitude sprints. At camp’s end, each flyer returns to the sensor platforms for post-training measurements (post_training_wfr).
Our code is as follows:

wing_flap %>% dependent_mean_median(col1 = pre_training_wfr, 
                                    col2 = post_training_wfr)

Two Dependent Means: Summary Statistics

Running the code,

wing_flap %>% dependent_mean_median(col1 = pre_training_wfr, 
                                    col2 = post_training_wfr)

Confidence Intervals: Two Dependent Means

(1-\alpha)100\% confidence interval for \mu_d:

\bar{d} \pm t_{\alpha/2} \frac{s_d}{\sqrt{n}}

where
- \bar{d} = \text{mean}(x_1-x_2) is the point estimate for \mu_d = \mu_1-\mu_2
- t_{\alpha/2} has n-1 degrees of freedom
- s_d is the sample standard deviation of the difference
- n is the number of pairs of observations

Confidence Intervals: Two Dependent Means (R)

We will use the dependent_mean_CI function from library(ssstats) to find the confidence interval.
Generic syntax:

dataset_name %>% dependent_mean_CI(col1 = first_group,
                                   col2 = second_group,
                                   confidence = confidence_level)

Confidence Intervals: Two Dependent Means

We now want to find the 99% CI for the average improvement in wing-flap rate.
- Hint: improvement can be measured with post - pre.
- Hint 2: post measurement: post_training_wfr, pre measurement: pre_training_wfr.
How should we change the following code?

dataset_name %>% dependent_mean_CI(col1 = first_group,
                                   col2 = second_group,
                                   confidence = confidence_level)

Confidence Intervals: Two Dependent Means

We now want to find the 99% CI for the average improvement in wing-flap rate.
- Hint: improvement can be measured with post - pre.
- Hint 2: post measurement: post_training_wfr, pre measurement: pre_training_wfr.
Our updated code should look like:

wing_flap %>% dependent_mean_CI(col1 = post_training_wfr,
                                col2 = pre_training_wfr,
                                confidence = 0.99)

Confidence Intervals: Two Dependent Means

Running the code,

wing_flap %>% dependent_mean_CI(col1 = post_training_wfr,
                                col2 = pre_training_wfr,
                                confidence = 0.99)

The point estimate for the mean difference is x̄ = -1.712.
The point estimate for the standard deviation of differences is s = 9.9701.
The 99% confidence interval for the mean difference μ_d is (-7.2891, 3.8651).

The 99% confidence interval for \mu_d is (-7.29, 3.87).
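
As a sanity check, the interval printed by the function can be approximately reproduced from the printed point estimate and standard deviation. This sketch assumes n = 25 pairs, which is our inference from the t(24) reported by the paired test on this data later in the lecture. Small discrepancies are expected because the inputs are rounded.

```r
# Values copied from the printed output above (post - pre)
d_bar <- -1.712
s_d   <- 9.9701
n     <- 25      # assumption: consistent with df = 24 in the paired test output
confidence <- 0.99

# Critical value with n - 1 degrees of freedom
t_crit <- qt(1 - (1 - confidence) / 2, df = n - 1)

# d_bar +/- t * s_d / sqrt(n)
ci <- d_bar + c(-1, 1) * t_crit * s_d / sqrt(n)
round(ci, 2)
```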

Confidence Intervals: Two Dependent Means

What happens if we flip the order?

wing_flap %>% dependent_mean_CI(col1 = pre_training_wfr,
                                col2 = post_training_wfr,
                                confidence = 0.99)

The point estimate for the mean difference is x̄ = 1.712.
The point estimate for the standard deviation of differences is s = 9.9701.
The 99% confidence interval for the mean difference μ_d is (-3.8651, 7.2891).

The 99% confidence interval for \mu_d is (-3.87, 7.29).

Confidence Intervals: Two Dependent Means

When looking at post - pre, the CI was (-7.29, 3.87).
When looking at pre - post, the CI was (-3.87, 7.29).
What is the relationship?
Why does the order matter?
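
One way to explore these questions is with a small experiment in base R. The pre and post vectors below are hypothetical, and t.test with paired = TRUE is used as a stand-in for dependent_mean_CI.

```r
# Hypothetical paired measurements (illustrative only)
pre  <- c(10, 12, 9, 11, 13, 10)
post <- c(12, 11, 10, 14, 13, 12)

# 99% CI for the mean of (post - pre)
ci_post_minus_pre <- as.numeric(
  t.test(post, pre, paired = TRUE, conf.level = 0.99)$conf.int
)

# 99% CI for the mean of (pre - post)
ci_pre_minus_post <- as.numeric(
  t.test(pre, post, paired = TRUE, conf.level = 0.99)$conf.int
)

# Flipping the order negates every difference, so the endpoints
# change sign and swap places
same_after_flip <- all.equal(ci_pre_minus_post, -rev(ci_post_minus_pre))
same_after_flip
```

The width of the interval is unchanged; only the sign, and therefore the interpretation of "improvement", depends on the order.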

Hypothesis Testing: Two Dependent Means

Hypotheses: Two Tailed
- H_0: \ \mu_d=\mu_0
- H_1: \ \mu_d \ne \mu_0
Hypotheses: Left Tailed
- H_0: \ \mu_d \ge \mu_0
- H_1: \ \mu_d < \mu_0
Hypotheses: Right Tailed
- H_0: \ \mu_d \le \mu_0
- H_1: \ \mu_d > \mu_0
Note! \mu_d = \mu_1 - \mu_2

Hypothesis Testing: Two Dependent Means

Test statistic:

t_0 = \frac{\bar{d}-\mu_0}{\frac{s_d}{\sqrt{n}}} \sim t_{\text{df}}

where
- \bar{d} = \text{mean}(x_1-x_2) is the point estimate for \mu_d = \mu_1-\mu_2
- \mu_0 is the hypothesized difference
- s_d is the sample standard deviation of the difference
- n is the number of pairs of observations
- \text{df} = n-1

Hypothesis Testing: Two Dependent Means

p-Value:

p-value: Two Tailed

p = 2\times P\left[t_{\text{df}} \ge |t_0|\right]

p-value: Left Tailed

p = P\left[t_{\text{df}} \le t_0\right]

p-value: Right Tailed

p = P\left[t_{\text{df}} \ge t_0\right]

Hypothesis Testing: Two Dependent Means (R)

We will use the dependent_mean_HT function from library(ssstats) to perform the necessary calculations for the hypothesis test.
Generic syntax:

dataset_name %>% dependent_mean_HT(col1 = first_group,
                                   col2 = second_group,
                                   alternative = "alternative_direction",
                                   mu = hypothesized_diff,
                                   alpha = alpha_level)

Hypothesis Testing: Two Dependent Means

Perform the appropriate hypothesis test to determine if there is a difference in wing-flap rate pre- and post-training. Test at the \alpha=0.01 level.
What is the direction of the test? How do you know?
What is the hypothesized value? How do you know?
What are the corresponding hypotheses?

Hypothesis Testing: Two Dependent Means

Perform the appropriate hypothesis test to determine if there is a difference in wing-flap rate pre- and post-training. Test at the \alpha=0.01 level.
How should we change the following code?

dataset_name %>% dependent_mean_HT(col1 = first_group,
                                   col2 = second_group,
                                   alternative = "alternative_direction",
                                   mu = hypothesized_diff,
                                   alpha = alpha_level)

Hypothesis Testing: Two Dependent Means

Perform the appropriate hypothesis test to determine if there is a difference in wing-flap rate pre- and post-training. Test at the \alpha=0.01 level.
Our updated code should look like:

wing_flap %>% dependent_mean_HT(col1 = post_training_wfr,
                                col2 = pre_training_wfr,
                                alpha = 0.01)

Hypothesis Testing: Two Dependent Means

Running the code,

wing_flap %>% dependent_mean_HT(col1 = post_training_wfr,
                                col2 = pre_training_wfr,
                                alpha = 0.01)

Paired t-test for the mean of differences:
Null: H₀: μ_d = 0
Alternative: H₁: μ_d ≠ 0
Test statistic: t(24) = -0.859
p-value: p = 0.399
Conclusion: Fail to reject the null hypothesis (p = 0.3991 ≥ α = 0.01)

Hypothesis Testing: Two Dependent Means

Hypotheses:
- H_0: \ \mu_{d} = 0, where \mu_d = \mu_{\text{post}} - \mu_{\text{pre}}
- H_1: \ \mu_d \ne 0
Test Statistic and p-Value
- t_0 = -0.859, p = 0.399
Rejection Region
- Reject H_0 if p < \alpha; \alpha = 0.01
Conclusion and interpretation
- Fail to reject H_0 (p \text{ vs } \alpha \to p = 0.399 > 0.01). There is not sufficient evidence to suggest that training changed wing-flap rates.
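
The reported test statistic and p-value can be approximately reproduced by hand from the summary values printed earlier for post - pre. This is a sketch using rounded inputs, with n = 25 taken from the reported t(24), so the last digit may differ slightly from the function's output.

```r
# Values copied from the earlier printed output (post - pre)
d_bar <- -1.712
s_d   <- 9.9701
n     <- 25   # from df = n - 1 = 24
mu0   <- 0    # hypothesized difference

# Test statistic for the paired t-test
t0 <- (d_bar - mu0) / (s_d / sqrt(n))

# Two-tailed p-value with n - 1 degrees of freedom
p <- 2 * pt(abs(t0), df = n - 1, lower.tail = FALSE)

round(c(t0 = t0, p = p), 3)
```

Because p is well above \alpha = 0.01, the hand calculation leads to the same decision as the function output.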

Wrap Up

Today’s lecture:
- Independent t-test
- Dependent t-test
Next week:
- Monday/Tuesday: lab
  - Quiz: two-sample means
- Wednesday/Thursday: lecture
  - Assumptions on t-tests
  - Alternatives to t-tests