In the previous weeks, we have built up our understanding of data visualization.
In all three weeks, we were dealing with a continuous outcome that we assumed had a normal distribution.
In this lecture, we will focus on visualizing gamma regression models.
Recall our data for wait times at Magic Kingdom,
library(tidyverse) # provides read_csv(), %>%, and ggplot()
mk_wait <- read_csv("https://raw.githubusercontent.com/samanthaseals/SDSII/refs/heads/main/files/data/lectures/W4_wait_times.csv")
mk_wait %>% head()
We constructed the model,
\ln(\hat{y}) = 1.28 + 0.015 \text{ temp} + 0.11 \text{ crowd} - 0.19 \text{ family} + 0.55 \text{ thrill}
In general, the fitted model takes the form,
\ln(\hat{y}) = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \cdots + \hat{\beta}_k x_k
Now that \ln() appears in the model, we need to be careful when creating predicted values: the right-hand side predicts \ln(y), so we exponentiate it to return to the original scale of y.
\hat{y} = e^{\hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \cdots + \hat{\beta}_k x_k}
\ln(\hat{y}) = 1.28 + 0.015 \text{ temp} + 0.11 \text{ crowd} - 0.19 \text{ family} + 0.55 \text{ thrill}
Let’s have crowd index on the x-axis, wait time on the y-axis, and lines defined by ride type.
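The plot needs one predicted wait time per ride type at each crowd index, with temperature held at a single value. The following is a minimal sketch of that step, not necessarily how the columns were created for this lecture: `toy` is a made-up stand-in for `mk_wait`, and `temp_fixed = 80` is a hypothetical temperature chosen only for illustration (the observed mean temperature would be a natural choice).

```r
library(dplyr)

# Toy stand-in for mk_wait: crowd index values made up for illustration
toy <- tibble(crowd_index = 1:10)

# Hypothetical temperature to hold constant (an assumption, not from the data)
temp_fixed <- 80

# Plug into the fitted model, then back-transform with exp().
# Dark rides are the reference group, so both indicators are 0 for them.
toy <- toy %>%
  mutate(
    wait_dark   = exp(1.28 + 0.015 * temp_fixed + 0.11 * crowd_index),
    wait_family = exp(1.28 + 0.015 * temp_fixed + 0.11 * crowd_index - 0.19),
    wait_thrill = exp(1.28 + 0.015 * temp_fixed + 0.11 * crowd_index + 0.55)
  )

toy %>% head()
```

Applying the same `mutate()` to `mk_wait` would produce the `wait_dark`, `wait_family`, and `wait_thrill` columns that the plotting code draws as lines.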
# wait_dark, wait_family, and wait_thrill hold predicted wait times, back-transformed with exp()
mk_wait %>% ggplot(aes(x = crowd_index)) +
  geom_point(aes(y = wait_time, color = ride_type), alpha = 0.5) + # observed wait times
  geom_line(aes(y = wait_dark), color = "#F8766D", linewidth = 1) + # predicted, dark rides
  geom_line(aes(y = wait_family), color = "#00BA38", linewidth = 1) + # predicted, family rides
  geom_line(aes(y = wait_thrill), color = "#00BFC4", linewidth = 1) + # predicted, thrill rides
  labs(x = "Crowd Index",
       y = "Wait Time (minutes)",
       color = "Ride Type") +
  theme_bw()
Now that we have the visualization, we can link it back to interpretations and inference. Recall,
Temperature is a significant predictor (p < 0.001).
Crowd index is a significant predictor (p < 0.001).
Ride type is a significant predictor (p < 0.001).
As compared to dark rides, we expect family rides to have e^{-0.19} = 0.827 times the wait time, or a 17.3% decrease in expected wait time.
As compared to dark rides, we expect thrill rides to have e^{0.55} = 1.733 times the wait time, or a 73.3% increase in expected wait time.
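The multiplicative effects and percent changes quoted above can be checked directly in R. This is a small sketch using only the ride-type coefficients from the fitted model; the object names `b`, `ratio`, and `pct_change` are chosen here for illustration.

```r
# Ride-type coefficients from the fitted model (dark rides are the reference group)
b <- c(family = -0.19, thrill = 0.55)

ratio <- exp(b)                  # multiplicative change in expected wait time
pct_change <- 100 * (ratio - 1)  # percent change relative to dark rides

round(ratio, 3)       # family 0.827, thrill 1.733
round(pct_change, 1)  # family -17.3, thrill 73.3
```

A ratio below 1 corresponds to a percent decrease and a ratio above 1 to a percent increase, which is why the family coefficient reads as a 17.3% decrease and the thrill coefficient as a 73.3% increase.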
In this lecture, we learned how to visualize gamma regression models.
Again, we are building upon what we have already learned in this course.
Now that we are venturing outside of the normal distribution, we need to think carefully about how to create predicted values.
Next lecture: Beta regression