Random Variables and their Distributions

June 26, 2025
Thursday

Introduction

The first few lectures come from Mathematical Statistics with Applications, by Wackerly.
- We must understand the underlying probability and random variable theory before moving into the Bayesian world.
We will be covering the following chapters:
- Chapter 2: probability theory
- Chapter 3: discrete random variables
- Chapter 4: continuous random variables

3.1: Basic Definitions

Discrete random variable: a variable that can assume only a finite or countably infinite number of distinct values.
Probability distribution of a random variable: collection of probabilities for each value of the random variable.
Notation:
- Uppercase letter (e.g., Y) denotes a random variable.
- Lowercase letter (e.g., y) denotes a particular value that the random variable may assume.
  - The specific observed value, y, is not random.

3.2: Probability Distributions for Discrete RV

Probability function for \boldsymbol Y: sum of the the probabilities of all sample points in S that are assigned the value y
- P[Y = y] = p(y): the probability that Y takes on the value y.
Probability distribution for \boldsymbol Y: a formula, table, or graph that provides p(y) \ \forall \ y.
Theorem:
- For any discrete probability distribution, the following must be true:
  - 0 \le p(y) \le 1 \ \forall \ y
  - \sum_y p(y) = 1 \ \forall \ p(y) > 0.

3.3: Expected Values for Discrete RV

Expected value: Let Y be a discrete random variable with probability function p(y). Then the expected value of Y, E[Y], is defined to be

E(Y) = \sum_{y} y p(y)

When p(y) is an accurate characterization of the population frequency distribution, then the expected value is the population mean.

E[Y] = \mu

3.3: Expected Values for Discrete RV

Theorem:
- Let Y be a discrete random variable with probability function p(y) and g(Y) be a real-valued function of Y (i.e., a transformed variable). Then the expected value of g(Y) is given by

E[g(Y)] = \sum_{y} g(y) p(y)

3.3: Expected Values for Discrete RV

Variance: if Y is a random variable with mean E[Y] = \mu, the variance of a random variable Y is defined to be the expected value of (Y-\mu)^2.

V[Y] = E\left[ (Y-\mu)^2 \right]

If p(y) is an accurate characterization of the population frequency distribution, then V(Y) is the population variance,

V[Y] = \sigma^2

Standard deviation: the positive square root of V[Y].

3.3: Expected Values

The probability distribution for a random variable Y is given below.

y	p(y)
0	1/8
1	1/4
2	3/8
3	1/4

Find the mean of Y.

3.3: Expected Values

The probability distribution for a random variable Y is given below.

y	p(y)
0	1/8
1	1/4
2	3/8
3	1/4

Find the variance and standard deviation of Y.

3.3: Expected Values

Theorem:
- Let Y be a discrete random variable with probability function p(y) and c be a constant. Then,

E(c) = c

Theorem:
- Let Y be a discrete random variable with probability function p(y), g(Y) be a function of Y, and c be a constant. Then,

E[cg(Y)] = cE[g(Y)]

3.3: Expected Values

Theorem:
- Let Y be a discrete random variable with probability function p(y), and g_1(Y), g_2(Y), ..., g_k(Y) be k functions of Y. Then,

E[g_1(Y) + g_2(Y) + ... + g_k(Y)] = E[g_1(Y)] + E[g_2(Y)] + ... + E[g_k(Y)]

Why do we need these theorems? To get here –
Theorem:
- Let Y be a discrete random variable with probability function p(y) and mean E[Y] = \mu. Then,

V[Y] = \sigma^2 = E\left[(Y-\mu)^2\right] = E\left[Y^2\right] - \mu^2

3.3: Expected Values

The probability distribution for a random variable Y is given below.

y	p(y)
0	1/8
1	1/4
2	3/8
3	1/4

Use the previous theorem to find V[Y] and compare to our previous answer.
- Recall that \mu=1.75.

3.4: Binomial Probability Distribution

Binomial experiment:
1. The experiment consists of a fixed number, n, of identical trials.
2. Each trial results in one of two outcomes: success (S) or failure (F).
3. The probability of success on a single trial is equal to some value p and remains the same from trial to trial.
  - The probability of failure is equal to q = (1-p).
4. The trials are independent.
5. The random variable of interest is Y, the number of successes observed during the n trials.

3.4: Binomial Probability Distribution

A random variable Y is said to have a binomial distribution based on n trials with success probability p iff

p(y) = {n \choose y}p^y q^{n-y}, \text{ where } y = 0, 1, 2, ..., n, \text{ and } 0 \le p \le1

Let Y be a binomial random variable based on n trials and success probability p. Then

E[Y] = \mu = np \ \ \ \text{and} \ \ \ V[Y] = \sigma^2 = npq

See Wackerly pg. 107 for derivation.

3.4: Binomial Probability Distribution

We can use R to find information related to the binomial distribution.
- P[X = x]: dbinom(x, size, prob)
- P[X \le x]: pbinom(q, size, prob)
- P[X > x]: pbinom(q, size, prob, lower.tail = FALSE)
In the functions,
- x or q is the value of X we are interested in
- size is the sample size (n)
- prob is the probability of success, \pi
- lower.tail has two options:
  - TRUE (default) returns P[X \le x]
  - FALSE returns P[X > x]

3.4: Binomial Probability Distribution

The manufacturer of a dairy drink wishes to compare a new formula (B) with that of the standard formula (A). Each of four judges perform a blinded taste test and report which glass he or she most enjoyed. Suppose that the two formulas are equally attractive.
1. What is the probability distribution?
2. What is the mean of the distribution?
3. What is the variance of the distribution?

3.4: Binomial Probability Distribution

The manufacturer of a dairy drink wishes to compare a new formula (B) with that of the standard formula (A). Each of four judges perform a blinded taste test and report which glass he or she most enjoyed. Suppose that the two formulas are equally attractive.
Use R to find:
1. P[X = 2]
2. P[X > 2]
3. P[X < 4]

3.4: Binomial Probability Distribution

The manufacturer of a dairy drink wishes to compare a new formula (B) with that of the standard formula (A). Each of four judges perform a blinded taste test and report which glass he or she most enjoyed. Suppose that the two formulas are equally attractive.
Use R to find:
1. P[X = 2]

dbinom(x = 2, size = 4, prob = 0.5)

[1] 0.375

3.4: Binomial Probability Distribution

The manufacturer of a dairy drink wishes to compare a new formula (B) with that of the standard formula (A). Each of four judges perform a blinded taste test and report which glass he or she most enjoyed. Suppose that the two formulas are equally attractive.
Use R to find:
1. P[X > 2]

pbinom(q = 2, size = 4, prob = 0.5, lower.tail = FALSE)

[1] 0.3125

3.4: Binomial Probability Distribution

The manufacturer of a dairy drink wishes to compare a new formula (B) with that of the standard formula (A). Each of four judges perform a blinded taste test and report which glass he or she most enjoyed. Suppose that the two formulas are equally attractive.
Use R to find:
1. P[X < 4] = P[X \le 3]

pbinom(q = 3, size = 4, prob = 0.5)

[1] 0.9375

3.8: Poisson Probability Distribution

We often use the Poisson distribution to model count data.
A random variable Y is said to have a Poisson probability distribution iff

p(y) = \frac{\lambda^y}{y!}e^{-\lambda}, \text{ where } y=0,1,2,..., \text{ and } \lambda > 0

If Y is a random variable with a Poisson distribution with parameter \lambda, then

E[Y] = \mu = \lambda \text{ and } V[Y] = \sigma^2 = \lambda

See Wackerly pg. 134 for derivation.

3.8: Poisson Probability Distribution

We can use R to find information related to the Poisson distribution.
- P[X = x]: dpois(x, lambda)
- P[X \le x]: ppois(q, lambda)
- P[X > x]: ppois(q, lambda, lower.tail = FALSE)
In the functions:
- x or q is the value of X we are interested in
- lambda is the rate of occurrence
- lower.tail has two options:
  - TRUE (default) returns P[X \le x]
  - FALSE returns P[X > x]

3.8: Poisson Probability Distribution

Customers arrive at a checkout counter in a department store according to a Poisson distribution at an average of seven per hour.
1. What is the probability distribution?
2. What is the mean of the distribution?
3. What is the variance of the distribution?

3.8: Poisson Probability Distribution

Customers arrive at a checkout counter in a department store according to a Poisson distribution at an average of seven per hour. Use R to find the following probabilities.
1. No more than three customers arrive.
2. At least two customers arrive.
3. Exactly five customers arrive.

3.8: Poisson Probability Distribution

Customers arrive at a checkout counter in a department store according to a Poisson distribution at an average of seven per hour. Use R to find the following probabilities.
1. No more than three customers arrive.

ppois(q = 3, lambda = 7)

[1] 0.08176542

3.8: Poisson Probability Distribution

Customers arrive at a checkout counter in a department store according to a Poisson distribution at an average of seven per hour. Use R to find the following probabilities.
1. At least two customers arrive.

ppois(q = 1, lambda = 7, lower.tail = FALSE)

[1] 0.9927049

3.8: Poisson Probability Distribution

Customers arrive at a checkout counter in a department store according to a Poisson distribution at an average of seven per hour. Use R to find the following probabilities.
1. Exactly five customers arrive.

dpois(x = 5, lambda = 7)

[1] 0.1277167

4.2: Probability Distributions for Continuous RV

The distribution function of Y (any random varaible), denoted by F(y), is such that

F(y) = P[Y \le y] \text{ for } -\infty < y < \infty

Theorem: Properties of a Distribution Function
- If F(y) is a distribution function, then
  - F(-\infty) \equiv \underset{y \to -\infty}{\lim} F(y) = 0
  - F(\infty) \equiv \underset{y \to \infty}{\lim} F(y) = 1
  - F(y) is a nondecreasing function of y.
    - If y_1 and y_2 are any values such that y_1 < y_2, then F(y_1) \le F(y_2).

4.2: Probability Distributions for Continuous RV

Continuous random variable:
- A random variable Y with distribution function F(y) is said to be continuous if F(y) is continuous for -\infty < y < \infty.
If Y is a continuous random variable, then for any real number y, P[Y = y] = 0.
- i.e., we must remember to find the probability of an interval.

4.2: Probability Distributions for Continuous RV

Probability density function:
- Let F(y) be the cumulative density function for a continuous random variable, Y. Then

p[Y = y] = f(y) = \frac{dF(y)}{dy} = F'(y).

Theorem: Properties of a Density Function
- If f(y) is a density function for a continuous random variable, then
  - f(y) \ge 0 \ \forall y, \ -\infty < y < \infty.
  - \int_{-\infty}^{\infty} f(y) dy = 1.

4.2: Probability Distributions for Continuous RV

Cumulative density function:
- Let f(y) be the probability density function for a continuous random variable, Y. Then

P[Y \le y] = F(y) = \int_{-\infty}^y f(t) dt, \text{ for } y \in {\rm I\!R}.

4.2: Probability Distributions for Continuous RV

Theorem
- If the random variable Y has density function f(y) and a < b, then the probability that Y falls in the interval [a, b] is

P[a \le Y \le b] = \int_a^b f(y) dy.

4.3: Expected Values for Continuous RV

Expected value:
- The expected value of a continuous variable Y is

E[Y] = \int_{-\infty}^{\infty} y f(y) \ dy

This is the continuous version of the expected value for a discrete random variable,

E[Y] = \sum_y y p(y)

4.3: Expected Values for Continuous RV

Theorem:
- Let g(Y) be a function of Y; then the expected value of g(Y) is given by

E\left[ g(Y) \right] = \int_{-\infty}^{\infty} g(y) f(y) \ dy

Theorem:
- Let c be a constant and let g(Y), g_1(Y), g_2(Y), …, g_k(Y) be functions of a continuous random variable, Y. Then the following results hold:
  - E[c] = c
  - E\left[cg(Y)] = cE[g(Y)\right]
  - E\left[g_1(Y)+...+g_k(Y)\right] = E\left[ g_1(Y) \right] + ... + E\left[ g_k(Y) \right]

4.4: Uniform Probability Distribution

Uniform Distribution

4.4: Uniform Probability Distribution

A random variable Y is said to have a uniform distribution iff

f(y) = \frac{1}{\theta_2 - \theta_1}, \ \theta_1 \le y \le \theta_2

If \theta_1 < \theta_2 and Y is a uniformly distributed r.v. on the interval (\theta_1, \theta_2), then

E[Y] = \mu = \frac{\theta_1+\theta_2}{2} \ \ \ \text{and} \ \ \ V[Y] = \sigma^2 = \frac{(\theta_2-\theta_1)^2}{12}

See Wackerly pg. 176 for derivation.

4.4: Uniform Probability Distribution

An industrial psychologist has determined that it takes a worker between 9 and 15 minutes to complete a task on an automobile assembly line. If the time to complete the task is uniformly distributed over the interval 9 \le y \le 15, then determine:
1. The probability distribution.
2. The mean of the distribution.
3. The variance and standard deviation of the distribution.

4.4: Uniform Probability Distribution

We can use R to find information related to the uniform distribution:
- P[X \le x]: punif(q, min, max)
- P[X \ge x]: punif(q, min, max, lower.tail = FALSE)
In the functions:
- q is the value of X we are interested in
- min is the lower bound of the distribution
- max is the upper bound of the distribution
- lower.tail has two options:
  - TRUE (default) returns P[X \le x]
  - FALSE returns P[X \ge x]

4.4: Uniform Probability Distribution

An industrial psychologist has determined that it takes a worker between 9 and 15 minutes to complete a task on an automobile assembly line. If the time to complete the task is uniformly distributed over the interval 9 \le y \le 15, then determine the following probabilities:
1. A worker takes fewer than 13 minutes.
2. A worker takes at least 11 minutes.
3. A worker takes between 14 and 15 minutes.

4.4: Uniform Probability Distribution

An industrial psychologist has determined that it takes a worker between 9 and 15 minutes to complete a task on an automobile assembly line. If the time to complete the task is uniformly distributed over the interval 9 \le y \le 15, then determine the following probabilities:
1. A worker takes fewer than 13 minutes.

punif(13, 9, 15)

[1] 0.6666667

4.4: Uniform Probability Distribution

An industrial psychologist has determined that it takes a worker between 9 and 15 minutes to complete a task on an automobile assembly line. If the time to complete the task is uniformly distributed over the interval 9 \le y \le 15, then determine the following probabilities:
1. A worker takes at least 11 minutes.

punif(11, 9, 15, lower.tail = TRUE)

[1] 0.3333333

4.4: Uniform Probability Distribution

An industrial psychologist has determined that it takes a worker between 9 and 15 minutes to complete a task on an automobile assembly line. If the time to complete the task is uniformly distributed over the interval 9 \le y \le 15, then determine the following probabilities:
1. A worker takes between 14 and 15 minutes.

punif(15, 9, 15) - punif(14, 9, 15)

[1] 0.1666667

4.5: Normal Probability Distribution

Normal Distribution

4.5: Normal Probability Distribution

A random variable Y is said to have a normal distribution iff, for \sigma > 0 and -\infty < \mu < \infty,

f(y) = \frac{1}{\sigma \sqrt{2\pi}} e^{-(y-\mu)^2/(2\sigma^2)}

If Y is a random variable normally distributed with parameters \mu and \sigma, then

E[Y] = \mu = \ \ \ \text{and} \ \ \ V[Y] = \sigma^2

4.5: Normal Probability Distribution

We can use R to find information related to the normal distribution.
- P[X \le x]: pnorm(q, mean, sd)
- P[X \ge x]: pnorm(q, mean, sd, lower.tail = FALSE)
In the functions:
- q is the value of X we are interested in
- mean is the population mean \mu
- sd is the standard deviation \sigma
- lower.tail has two options:
  - TRUE (default) returns P[X \le x]
  - FALSE returns P[X \ge x]

4.5: Normal Probability Distribution

A random variable Y is said to have a standard normal distribution iff

Y \sim N(\mu=0,\sigma=1)

The normal distribution is then simplified to

f(y) = \frac{1}{\sqrt{2\pi}} e^{-y^2/2}

Note that in all cases of the normal distribution, we assume -\infty < y < \infty.

4.5: Normal Probability Distribution

When using pnorm(), the default values for mean and sd are 1 and 0.
Thus, if we have the standard normal our R functions simplify to:
- P[Z \le z]: pnorm(z)
- P[Z \ge z]: pnorm(z, lower.tail = FALSE)
In the functions:
- q is the z-score value of interest
- lower.tail = TRUE returns P[Z \le z]
- lower.tail = FALSE returns P[Z \ge z]

4.5: Normal Probability Distribution

A geneticist working for a seed company develops a new carrot for growing in heavy clay soil. After measuring 5000 of these carrots, it can be said that the carrot length, Y, is normally distributed with \mu = 11.5 cm and \sigma = 1.15 cm. Determine
1. The probability distribution.
2. The mean of the distribution.
3. The variance and standard deviation of the distribution.

4.5: Normal Probability Distribution

A geneticist working for a seed company develops a new carrot for growing in heavy clay soil. After measuring 5000 of these carrots, it can be said that the carrot length, Y, is normally distributed with \mu = 11.5 cm and \sigma = 1.15 cm.
1. What is the probability that a carrot will be between 10 and 13 cm?
2. What is the probability that a carrot will be less than 9 cm?
3. What is the probability that a carrot will be 12 cm or larger?

4.5: Normal Probability Distribution

A geneticist working for a seed company develops a new carrot for growing in heavy clay soil. After measuring 5000 of these carrots, it can be said that the carrot length, Y, is normally distributed with \mu = 11.5 cm and \sigma = 1.15 cm.
1. What is the probability that a carrot will be between 10 and 13 cm?

pnorm(q = 13, mean = 11.5, sd = 1.15) - pnorm(q = 10, mean = 11.5, sd = 1.15)

[1] 0.807885

4.5: Normal Probability Distribution

A geneticist working for a seed company develops a new carrot for growing in heavy clay soil. After measuring 5000 of these carrots, it can be said that the carrot length, Y, is normally distributed with \mu = 11.5 cm and \sigma = 1.15 cm.
1. What is the probability that a carrot will be less than 9 cm?

pnorm(q = 9, mean = 11.5, sd = 1.15)

[1] 0.01485583

4.5: Normal Probability Distribution

A geneticist working for a seed company develops a new carrot for growing in heavy clay soil. After measuring 5000 of these carrots, it can be said that the carrot length, Y, is normally distributed with \mu = 11.5 cm and \sigma = 1.15 cm.
1. What is the probability that a carrot will be 12 cm or larger?

pnorm(q = 12, mean = 11.5, sd = 1.15, lower.tail = FALSE)

[1] 0.3318601

4.6: Gamma Probability Distribution

Gamma Distribution

4.6: Gamma Probability Distribution

A random variable Y is said to have a gamma distribution with parameters \alpha > 0 and \beta > 0 iff,

f(y) = \frac{y^{\alpha-1} e^{-y/\beta}}{\beta^{\alpha} \Gamma(\alpha)}, \ 0 \le y < \infty

Note that \Gamma(\alpha) = \int_{0}^{\infty} y^{\alpha-1} e^{-y} \ dy.
If Y has a gamma distribution with parameters \alpha and \beta, then

E[Y] = \mu = \alpha\beta \ \ \ \text{and} \ \ \ V[Y] = \sigma^2 = \alpha\beta^2

See Wackerly pg. 187 for derivation.

4.6: Gamma Probability Distribution

We can use R to find information related to the Gamma distribution.
- P[X \le x]: pgamma(q, shape, rate)
- P[X \ge x]: pgamma(q, shape, rate, lower.tail = FALSE)
In the functions:
- q is the value of X we are interested in
- shape is the shape parameter, \alpha
- scale is the scale parameter, \beta
  - Alternatively, can parameterize with rate = 1/\beta, rate = 1 / scale
- lower.tail has two options:
  - TRUE (default) returns P[X \le x]
  - FALSE returns P[X \ge x]

4.6: Gamma Probability Distribution

Annual incomes for heads of household in an affluent section of a city have approximately a gamma distribution with \alpha=32 and \beta=2500. Determine:
1. The probability distribution.
2. The mean of the distribution.
3. The variance and standard deviation of the distribution.

4.6: Gamma Probability Distribution

Annual incomes for heads of household in an affluent section of a city have approximately a gamma distribution with \alpha=32 and \beta=2500.
1. What proportion have incomes in excess of $100,000?
2. What proportion have incomes between $75,000 and $150,000?

4.6: Gamma Probability Distribution

Annual incomes for heads of household in a section of a city have approximately a gamma distribution with \alpha=32 and \beta=2500.
1. What proportion have incomes in excess of $30,000?

pgamma(q = 100000, shape = 32, scale = 2500, lower.tail = FALSE)

[1] 0.08552057

4.6: Gamma Probability Distribution

Annual incomes for heads of household in an affluent section of a city have approximately a gamma distribution with \alpha=32 and \beta=2500.
1. What proportion have incomes between $75,000 and $150,000?

pgamma(q = 150000, shape = 32, scale = 2500) - pgamma(q = 75000, shape = 32, scale = 2500)

[1] 0.6186147

4.7: Beta Probability Distribution

Beta Distribution

4.7: Beta Probability Distribution

A random variable Y is said to have a beta distribution with parameters \alpha > 0 and \beta > 0 iff,

f(y) = \frac{y^{\alpha-1}(1-y)^{\beta-1}}{B(\alpha,\beta)}, \ 0 \le y \le 1

Note: B(\alpha,\beta) = \int_0^1 y^{\alpha-1}(1-y)^{\beta-1} \ dy = \frac{\Gamma(\alpha) \Gamma(\beta)}{\Gamma(\alpha+\beta)}.
If Y has a beta distribution with parameters \alpha > 0 and \beta > 0, then

E[Y] = \mu = \frac{\alpha}{\alpha+\beta} \ \ \ \text{and} \ \ \ V[Y] = \sigma^2 = \frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}

4.7: Beta Probability Distribution

We can use R to find information related to the Beta distribution.
- P[X \le x]: pbeta(q, shape1, shape2)
- P[X \ge x]: pbeta(q, shape1, shape2, lower.tail = FALSE)
In the functions:
- q is the value of X we are interested in – must be in [0, 1]!
- shape1 is the first shape parameter, \alpha
- shape2 is the second shape parameter, \beta
- lower.tail has two options:
  - TRUE (default) returns P[X \le x]
  - FALSE returns P[X \ge x]

4.7: Beta Probability Distribution

In a survey of cupcake preferences, 8 respondents liked the new cupcake flavor and 2 did not. We will model the proportion of all respondents who would like the cupcake flavor using a Beta distribution with \alpha = 8 and \beta = 2. Determine:
1. The probability distribution.
2. The mean of the distribution.
3. The variance and standard deviation of the distribution.

4.7: Beta Probability Distribution

In a survey of cupcake preferences, 8 respondents liked the new cupcake flavor and 2 did not. We will model the proportion of all respondents who would like the cupcake flavor using a Beta distribution with \alpha = 8 and \beta = 2. What is the probability that:
1. fewer than 60% of respondents like the new flavor?
2. more than 90% of respondents like the new flavor?
3. somewhere between 70% and 90% of respondents like the new flavor?

4.7: Beta Probability Distribution

In a survey of cupcake preferences, 8 respondents liked the new cupcake flavor and 2 did not. We will model the proportion of all respondents who would like the cupcake flavor using a Beta distribution with \alpha = 8 and \beta = 2. What is the probability that:
1. fewer than 60% of respondents like the new flavor?

pbeta(q = 0.6, shape1 = 8, shape2 = 2)

[1] 0.07054387

4.7: Beta Probability Distribution

In a survey of cupcake preferences, 8 respondents liked the new cupcake flavor and 2 did not. We will model the proportion of all respondents who would like the cupcake flavor using a Beta distribution with \alpha = 8 and \beta = 2. What is the probability that:
1. more than 90% of respondents like the new flavor?

pbeta(q = 0.9, shape1 = 8, shape2 = 2, lower.tail = FALSE)

[1] 0.225159

4.7: Beta Probability Distribution

In a survey of cupcake preferences, 8 respondents liked the new cupcake flavor and 2 did not. We will model the proportion of all respondents who would like the cupcake flavor using a Beta distribution with \alpha = 8 and \beta = 2. What is the probability that:
1. somewhere between 70% and 90% of respondents like the new flavor?

pbeta(q = 0.9, shape1 = 8, shape2 = 2) - pbeta(q = 0.7, shape1 = 8, shape2 = 2)

[1] 0.5788377

Homework: Discrete Random Variables

3.6
3.10
3.15
3.22
3.34
3.60
3.128
3.129
3.136

Homework: Continuous Random Variables

4.11
4.14
4.17
4.21
4.28
4.45
4.46
4.48
4.68

4.69
4.70
4.97
4.98
4.102
4.124
4.125
4.128
4.131