STA4173: Biostatistics
Spring 2025
Until now, we have focused on continuous outcomes.
Now we will focus on categorical (or qualitative) outcomes.
Today, we will review how to test one and two sample proportions.
We will estimate a proportion using \hat{p},
\hat{p} = \frac{x}{n}
\hat{p}_1 - \hat{p}_2 = \frac{x_1}{n_1} - \frac{x_2}{n_2}
z_0 = \frac{\hat{p}-p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}}
\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}
We will use either the binom.test() function or the prop.test() function.
If we have n \le 30,
Humira is a medication used to treat rheumatoid arthritis (RA). In clinical trials of Humira, 705 subjects diagnosed with RA were administered 40 mg of Humira every other week. Of the 705 subjects, 66 reported nausea as a side effect. It is known that the proportion of RA subjects in similar studies receiving a placebo who report nausea as a side effect is 0.08. Does the sample evidence represent significant evidence that a higher proportion of subjects receiving Humira experience nausea as a side effect than those taking a placebo? Test at the \alpha = 0.05 level of significance.
What are the important pieces?
What is the point estimate, \hat{p}?
What is the 95% confidence interval for p?
Is the proportion of subjects taking Humira who experience nausea as a side effect higher than the proportion among those taking a placebo?
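These questions can be answered with prop.test(). The sketch below assumes the continuity correction is turned off (correct = FALSE), which is what the output that follows reports; the object names (ci_fit, test_fit) are illustrative only.

```r
# Humira example: x = 66 subjects reporting nausea out of n = 705
x <- 66
n <- 705

# Point estimate, p-hat = x / n
p_hat <- x / n
p_hat

# 95% confidence interval for p
# (the default null value p = 0.5 affects only the test, not the interval)
ci_fit <- prop.test(x = x, n = n, conf.level = 0.95, correct = FALSE)
ci_fit$conf.int

# One-sided test of H0: p = 0.08 against H1: p > 0.08
test_fit <- prop.test(x = x, n = n, p = 0.08,
                      alternative = "greater", correct = FALSE)
test_fit
```

Note that prop.test() reports a Wilson score interval for one sample, so its limits differ slightly from the \hat{p} \pm z_{\alpha/2} formula.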
1-sample proportions test without continuity correction
data: 66 out of 705, null probability 0.5
X-squared = 465.71, df = 1, p-value < 2.2e-16
alternative hypothesis: true p is not equal to 0.5
95 percent confidence interval:
0.07426251 0.11737620
sample estimates:
p
0.09361702
1-sample proportions test without continuity correction
data: 66 out of 705, null probability 0.08
X-squared = 1.7761, df = 1, p-value = 0.09131
alternative hypothesis: true p is greater than 0.08
95 percent confidence interval:
0.07709288 1.00000000
sample estimates:
p
0.09361702
Hypotheses:
Test Statistic and p-Value
Rejection Region
Conclusion / Interpretation
Fail to reject H_0.
There is not sufficient evidence to suggest that the proportion of subjects taking Humira who experience nausea is greater than 0.08.
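As a check on the decision above, the test statistic z_0 and the rejection region can be computed by hand from the formulas given earlier. This is a minimal sketch; the variable names are illustrative, and the interval at the end is the formula-based (Wald) interval rather than the one prop.test() prints.

```r
# Humira example, one-sided test of H0: p = 0.08 vs H1: p > 0.08
x     <- 66
n     <- 705
p0    <- 0.08
alpha <- 0.05

p_hat <- x / n

# z0 = (p-hat - p0) / sqrt(p0 (1 - p0) / n)
z0 <- (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Upper-tail p-value and critical value for a "greater than" alternative
p_value <- pnorm(z0, lower.tail = FALSE)
z_crit  <- qnorm(1 - alpha)

# Reject H0 only if z0 falls in the rejection region z0 > z_crit
reject <- z0 > z_crit

z0^2      # equals the X-squared value reported by prop.test()
p_value
reject

# Formula-based interval: p-hat +/- z_{alpha/2} * sqrt(p-hat (1 - p-hat) / n)
wald_ci <- p_hat + c(-1, 1) * qnorm(1 - alpha / 2) * sqrt(p_hat * (1 - p_hat) / n)
wald_ci
```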
Which do you think is easier to raise – a boy or a girl? When asked this question in 1947, 24% of all Americans said raising a girl was easier. In June 2018, the Gallup Organization surveyed 1500 adult Americans, of which 408 felt it was easier to raise a girl. Does this result suggest the proportion of adult Americans who believe it is easier to raise a girl has changed since 1947? Test at the \alpha=0.10 level.
What are the important pieces?
1-sample proportions test without continuity correction
data: 408 out of 1500, null probability 0.24
X-squared = 8.4211, df = 1, p-value = 0.003709
alternative hypothesis: true p is not equal to 0.24
95 percent confidence interval:
0.2500845 0.2950804
sample estimates:
p
0.272
Hypotheses
Test Statistic and p-Value
Rejection Region
Conclusion / Interpretation
Reject H_0.
There is sufficient evidence to suggest that the proportion of adult Americans who believe that it is easier to raise a girl has changed since 1947.
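The analysis for this example can be reproduced along the following lines. This sketch assumes no continuity correction, as in the output above; it also sets conf.level = 1 - alpha so that the interval matches the \alpha = 0.10 test, whereas the output above shows the default 95% interval. Object names are illustrative.

```r
# Gallup example: two-sided test of H0: p = 0.24 vs H1: p != 0.24
x     <- 408
n     <- 1500
p0    <- 0.24
alpha <- 0.10

gallup_fit <- prop.test(x = x, n = n, p = p0,
                        alternative = "two.sided",
                        conf.level = 1 - alpha,   # 90% interval to match alpha
                        correct = FALSE)
gallup_fit

# The same test by hand: z0 and the two-sided rejection region |z0| > z_{alpha/2}
z0     <- (x / n - p0) / sqrt(p0 * (1 - p0) / n)
z_crit <- qnorm(1 - alpha / 2)

abs(z0) > z_crit             # TRUE means z0 is in the rejection region
gallup_fit$p.value < alpha   # equivalent decision using the p-value
```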
z_0 = \frac{\left( \hat{p}_1 - \hat{p}_2 \right)- d_0}{\sqrt{\hat{p}\left(1-\hat{p}\right)\left( \frac{1}{n_1}+\frac{1}{n_2} \right)}}
\hat{p}_1 = \frac{x_1}{n_1}, \ \ \ \hat{p}_2 = \frac{x_2}{n_2}, \ \ \ \hat{p} = \frac{x_1+x_2}{n_1+n_2}
(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2} \sqrt{\frac{\hat{p}_1 (1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}
\hat{p}_1-\hat{p}_2=\frac{x_1}{n_1} - \frac{x_2}{n_2}
and
To construct this interval, we require:
We will use the prop.test() function.
In clinical trials of Nasonex, 3774 adult and adolescent allergy patients (patients 12 years and older) were randomly divided into two groups.
The patients in group 1 (experimental group) received 200 \mu g of Nasonex.
The patients in group 2 (control group) received a placebo.
Is there evidence to conclude that the proportion of Nasonex users who experienced headaches as a side effect is greater than the proportion in the control group?
Test at the \alpha= 0.05 level of significance.
What are the important pieces?
Of the 2103 patients in the experimental group, 547 reported headaches as a side effect.
Of the 1671 patients in the control group, 368 reported headaches as a side effect.
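With two samples, prop.test() takes a vector of successes and a vector of sample sizes. The sketch below is one way to obtain the output that follows, assuming the experimental group is listed first (so that "greater" means p_Exp > p_Ctrl) and the continuity correction is off; the object names are illustrative.

```r
# Nasonex example: group 1 = experimental, group 2 = control
x <- c(547, 368)     # patients reporting headaches
n <- c(2103, 1671)   # patients per group

# Two-sided call: point estimates and the 95% CI for p_1 - p_2
ci_fit <- prop.test(x = x, n = n, conf.level = 0.95, correct = FALSE)
ci_fit$estimate
ci_fit$conf.int

# One-sided test of H0: p_1 = p_2 against H1: p_1 > p_2
test_fit <- prop.test(x = x, n = n, alternative = "greater", correct = FALSE)
test_fit
```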
2-sample test for equality of proportions without continuity correction
data: c(547, 368) out of c(2103, 1671)
X-squared = 8.0618, df = 1, p-value = 0.004521
alternative hypothesis: two.sided
95 percent confidence interval:
0.01255827 0.06719613
sample estimates:
prop 1 prop 2
0.2601046 0.2202274
Thus, \hat{p}_{\text{Exp}} = 0.260, \hat{p}_{\text{Ctrl}} = 0.220 and \hat{p}_{\text{Exp}} - \hat{p}_{\text{Ctrl}} = 0.04.
The 95% CI for p_{\text{Exp}} - p_{\text{Ctrl}} is (0.013, 0.067).
2-sample test for equality of proportions without continuity correction
data: c(547, 368) out of c(2103, 1671)
X-squared = 8.0618, df = 1, p-value = 0.00226
alternative hypothesis: greater
95 percent confidence interval:
0.01695043 1.00000000
sample estimates:
prop 1 prop 2
0.2601046 0.2202274
Hypotheses
Test Statistic and p-value
Rejection Region
Conclusion / Interpretation
Reject H_0.
There is sufficient evidence to suggest that the proportion of Nasonex users who experienced headaches as a side effect is greater than that of the control group.
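The two-sample formulas can also be applied by hand to the Nasonex counts. This is a sketch with illustrative variable names: the test statistic uses the pooled \hat{p}, while the confidence interval uses the unpooled standard error, as in the formulas given earlier.

```r
# Nasonex example by hand
x1 <- 547; n1 <- 2103   # experimental group
x2 <- 368; n2 <- 1671   # control group
d0    <- 0              # hypothesized difference under H0
alpha <- 0.05

p1_hat <- x1 / n1
p2_hat <- x2 / n2
p_pool <- (x1 + x2) / (n1 + n2)   # pooled estimate, used only in the test

# z0 = ((p1-hat - p2-hat) - d0) / sqrt(p-hat (1 - p-hat) (1/n1 + 1/n2))
z0 <- ((p1_hat - p2_hat) - d0) /
  sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))

p_value <- pnorm(z0, lower.tail = FALSE)   # upper tail for H1: p1 > p2
z_crit  <- qnorm(1 - alpha)                # reject H0 if z0 > z_crit

# 95% CI for p1 - p2 using the unpooled standard error
se_unpooled <- sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
ci <- (p1_hat - p2_hat) + c(-1, 1) * qnorm(1 - alpha / 2) * se_unpooled

z0^2      # equals the X-squared value reported by prop.test()
p_value
z0 > z_crit
ci
```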
The goodness-of-fit test allows us to determine if a frequency distribution follows a specific distribution.
This could be a named distribution (e.g., normal)
It could also be a distribution without a name (e.g., the probabilities are specified)
Before we can perform the goodness-of-fit test, we must compute expected counts.
E_i = n p_i
Hypotheses
Test Statistic
p-Value
Rejection Region
We will use the chisq.test() function and plug in both the counts and the expected probabilities.
Using the economy data below (based on the 2017 Current Population Survey, adjusted for inflation), determine if there is evidence to suggest that the distribution of income has changed since 2000.
Test at the \alpha = 0.05 level of significance.
Income | Observed | Probability |
---|---|---|
Under $15,000 | 161 | 0.099 |
$15,000 - $24,999 | 144 | 0.098 |
$25,000 - $34,999 | 138 | 0.093 |
$35,000 - $49,999 | 184 | 0.135 |
$50,000 - $74,999 | 247 | 0.179 |
$75,000 - $99,999 | 188 | 0.131 |
$100,000 - $149,999 | 217 | 0.149 |
$150,000 - $199,999 | 105 | 0.061 |
Over $200,000 | 116 | 0.055 |
counts <- c(161, 144, 138, 184, 247, 188, 217, 105, 116) # create O_i vector
probs <- c(0.099, 0.098, 0.093, 0.135, 0.179, 0.131, 0.149, 0.061, 0.055) # create p_i vector
chisq.test(counts, p = probs)
Chi-squared test for given probabilities
data: counts
X-squared = 20.693, df = 8, p-value = 0.00801
Hypotheses
Test Statistic and p-Value
Rejection Region
Conclusion/Interpretation
Reject H_0.
There is sufficient evidence to suggest that the distribution of income in 2017 does not follow the same distribution as in 2000.
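To connect the output back to E_i = n p_i, the expected counts and the statistic can be computed by hand. The sketch below assumes the usual Pearson form, the sum of (O_i - E_i)^2 / E_i with k - 1 degrees of freedom; variable names other than counts and probs are illustrative.

```r
counts <- c(161, 144, 138, 184, 247, 188, 217, 105, 116)                  # O_i
probs  <- c(0.099, 0.098, 0.093, 0.135, 0.179, 0.131, 0.149, 0.061, 0.055) # p_i

n        <- sum(counts)    # total sample size
expected <- n * probs      # E_i = n * p_i
expected

# Pearson chi-square statistic with k - 1 degrees of freedom
chi_sq  <- sum((counts - expected)^2 / expected)
df      <- length(counts) - 1
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)

# Rejection region at alpha = 0.05: reject H0 if chi_sq > crit
crit <- qchisq(1 - 0.05, df = df)

chi_sq
p_value
chi_sq > crit
```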
An obstetrician wants to know whether the proportion of children born on each day of the week is the same.
She randomly selects 500 birth records and obtains the data shown in the table below (based on data obtained from Vital Statistics of the United States, 2016).
At the \alpha = 0.01 level of significance, is there reason to believe that births do not occur with equal frequency on each day of the week?
Sun | Mon | Tues | Weds | Thurs | Fri | Sat |
---|---|---|---|---|---|---|
46 | 76 | 83 | 81 | 81 | 80 | 53 |
Hypotheses
Test Statistic and p-Value
Rejection Region
Conclusion/Interpretation
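One way to set up this example in R is sketched below. It assumes the null hypothesis assigns probability 1/7 to each day; the object names are illustrative, and the printed results are left for filling in the steps above.

```r
# Births by day of the week (500 records)
births <- c(Sun = 46, Mon = 76, Tues = 83, Weds = 81,
            Thurs = 81, Fri = 80, Sat = 53)
alpha <- 0.01

# Goodness-of-fit test against equal probabilities for the 7 days
birth_fit <- chisq.test(births, p = rep(1 / 7, 7))

birth_fit$expected    # E_i = n * p_i = 500 / 7 for every day
birth_fit             # test statistic, df, and p-value

# Rejection region: reject H0 if the statistic exceeds this critical value
qchisq(1 - alpha, df = length(births) - 1)

# Equivalent decision using the p-value
birth_fit$p.value < alpha
```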
Let us now discuss testing two categorical variables to determine if a relationship exists.
Take, for example, this data:
We will use the \chi^2 test for independence to determine if happiness depends on marital status.
E_{ij} = \frac{R_i C_j}{n}
Hypotheses
Test Statistic
p-Value
Rejection Region
If given a summary table, we can enter the observed counts with matrix() (see example) and use the chisq.test() function.
If given raw data, we can use the CrossTable() function in the gmodels package.
observed_table <- matrix(c(600,  63, 112, 144,
                           720, 142, 355, 459,
                            93,  51, 119, 127),
                         nrow = 3, ncol = 4, byrow = TRUE)
# I prefer to include breaks to make it look like the table given just for checking purposes
# make sure you edit the number of rows (nrow) and columns (ncol)!
rownames(observed_table) <- c("Very Happy", "Pretty Happy", "Not Too Happy") # name rows
colnames(observed_table) <- c("Married", "Widowed", "Divorced/Separated", "Never Married") # name cols
observed_table # print table to make sure it is what we want
Married Widowed Divorced/Separated Never Married
Very Happy 600 63 112 144
Pretty Happy 720 142 355 459
Not Too Happy 93 51 119 127
chisq.test(observed_table) # run the chi-square test for independence
Pearson's Chi-squared test
data: observed_table
X-squared = 224.12, df = 6, p-value < 2.2e-16
Hypotheses
Test Statistic and p-Value
Rejection Region
Conclusion/Interpretation
Reject H_0.
There is sufficient evidence to suggest that happiness depends on marital status.
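The expected counts E_{ij} = R_i C_j / n behind this result can be computed directly from the row and column totals. This is a sketch with illustrative variable names; it assumes the Pearson statistic with (r - 1)(c - 1) degrees of freedom.

```r
observed_table <- matrix(c(600,  63, 112, 144,
                           720, 142, 355, 459,
                            93,  51, 119, 127),
                         nrow = 3, ncol = 4, byrow = TRUE)

n <- sum(observed_table)

# E_ij = (row total i) * (column total j) / n
expected <- outer(rowSums(observed_table), colSums(observed_table)) / n
round(expected, 2)

# Pearson chi-square statistic and degrees of freedom (r - 1)(c - 1)
chi_sq  <- sum((observed_table - expected)^2 / expected)
df      <- (nrow(observed_table) - 1) * (ncol(observed_table) - 1)
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)

chi_sq
df
p_value

# chisq.test() stores the same expected counts
all.equal(expected, chisq.test(observed_table)$expected,
          check.attributes = FALSE)
```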
To see how the CrossTable() function works, let’s explore the Palmer penguin dataset.
library(gmodels)
penguins <- palmerpenguins::penguins
CrossTable(penguins$species, penguins$sex,
prop.chisq= FALSE, # turn off proportion contributed to chi-square statistic
prop.t = FALSE, # turn off total proportions
chisq = TRUE) # request chi-square test
Cell Contents
|-------------------------|
| N |
| N / Row Total |
| N / Col Total |
|-------------------------|
Total Observations in Table: 333
| penguins$sex
penguins$species | female | male | Row Total |
-----------------|-----------|-----------|-----------|
Adelie | 73 | 73 | 146 |
| 0.500 | 0.500 | 0.438 |
| 0.442 | 0.435 | |
-----------------|-----------|-----------|-----------|
Chinstrap | 34 | 34 | 68 |
| 0.500 | 0.500 | 0.204 |
| 0.206 | 0.202 | |
-----------------|-----------|-----------|-----------|
Gentoo | 58 | 61 | 119 |
| 0.487 | 0.513 | 0.357 |
| 0.352 | 0.363 | |
-----------------|-----------|-----------|-----------|
Column Total | 165 | 168 | 333 |
| 0.495 | 0.505 | |
-----------------|-----------|-----------|-----------|
Statistics for All Table Factors
Pearson's Chi-squared test
------------------------------------------------------------
Chi^2 = 0.04860717 d.f. = 2 p = 0.9759894
Hypotheses
Test Statistic and p-Value
Rejection Region
Conclusion/Interpretation
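The chi-square result printed by CrossTable() can be cross-checked from the cell counts alone. The sketch below re-enters the counts shown in the table above, so it does not need the gmodels or palmerpenguins packages; the object names and the choice of \alpha = 0.05 are assumptions, since no significance level is stated for this example.

```r
# Species-by-sex counts copied from the CrossTable() output
penguin_table <- matrix(c(73, 73,
                          34, 34,
                          58, 61),
                        nrow = 3, ncol = 2, byrow = TRUE,
                        dimnames = list(species = c("Adelie", "Chinstrap", "Gentoo"),
                                        sex     = c("female", "male")))

penguin_fit <- chisq.test(penguin_table)
penguin_fit

# Decision at an assumed alpha of 0.05
alpha <- 0.05
penguin_fit$p.value < alpha   # FALSE means we fail to reject H0
```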