Project 1: Fall 2024

Groups:

Group 1: Maria, Andee, Hailee, Jegg
Group 2: Heather, Chris, Liz, Sara, Jacob
Group 3: Vera, Catherine, Sarlina, Lionel, Killian
Group 4: Curtis, RK, Hooman, Johnny

Consider the emotions from Inside Out. Joy, Fear, and Ennui are binge watching Scooby Doo for the first time. After a few episodes, they discuss the probability of unmasking the villain. Anxiety has recently taken an interest in Bayesian analysis and decides to exploit this opportunity and play around with the beta-binomial conjugate family. She asks each of her fellow emotions two questions:

Based on the last 10 episodes you watched, what do you think the probability of unmasking is?
How certain are you of that probability?

Perpetually bored, Ennui shrugs when asked each question. Perpetually afraid, Fear implies that he is not very confident that the gang will unmask the villain… and he’s not very confident about that, either. Perpetually positive, Joy excitedly says that she’s very sure that the probability of unmasking is high - of course the gang will get them!

What do you think the corresponding prior distributions for \(\pi\) are?

Fear: distribution here

Rationale:

Ennui: distribution here

Rationale:

Joy: distribution here

Rationale:

We will help Anxiety perform this analysis! Consider the Scooby Doo data, available through Tidy Tuesday. Go ahead and import the data and define a new variable, unmasked, that is based on unmask.fred, unmask.daphnie, unmask.velma, unmask.shaggy, unmask.scooby, unmask.other. If any of them taken on the value "TRUE", unmasked should take on a 1 (to indicate that the monster was unmasked); otherwise, it should take on a 0 (to indicate that the monster was not unmasked).

Now, let’s create a “year” variable using the year() function from the lubridate package. Note that the date.aired variable contains the original air date of the episode.

Using the year variable, let’s now create a “decade” variable.

1970-1979
1980-1989
1990-1999
2000-2010
2010-2019
2020-2021

Each group will consider different aspects of episodes.

Group members will divide and conquer the data categories in their respective areas. The outcome of interest is if the villain was unmasked or not (unmasked, created earlier).

In this project, you will:

Use the data from 1969 (first season) to update each character’s prior. What are their posterior distributions?
Perform sequential analysis and update the posterior distribution for each decade that data was collected.
Evaluate and describe the progression of beliefs of each emotion.
- Are the beliefs different based on the characteristics explored by your group?
  - Note: We have not formally learned Bayesian inference – I am asking you to use the occular method of analysis for now.

Then, your group will create a 10 minute presentation (yes, with Quarto slides) to discuss the progression of beliefs.

Because the audience of your presentation is at the same level of understanding, no introductory explanations of Bayesian analysis are necessary.
Please organize your presentation as you see fit. What is the story that your group wants to tell? Do you want to tell the story character-by-character? Decade-by-decade?
Professional development: Create a gif using gganimate that demonstrates the characters’ understanding over time.

Group 1: Trap Effectiveness

Research Question: Does the presence of a trap, and subsequently, if it works, increase the probability of unmasking the villain in Scooby Doo episodes?

Data Categories: Please use the trap.work.first variable to create a “trap” variable that indicates:

A trap was not set
A trap was set but did not work
A trap was set and worked

Group 2: Character Involvement

Research Question: Does Scooby-Doo and Shaggy’s involvement in solving the mystery increase or decrease the likelihood of capturing the villain?

Data Categories: Consider the caught.shaggy, caught.scooby, captured.shaggy, captured.scooby variables. “Caught” describes instances where a character physically catches the villain involved in the mystery. “Captured” describes instances where a character apprehends the villain, often after a trap is set. Create variables that indicate the following about the villain:

Caught or captured by Scooby
Caught or captured by Shaggy
Caught or captured by Scooby or Shaggy
Not caught or captured by Scooby or Shaggy

Group 3: Number of Villains

Research Question: Does the number of villains affect the probability of the gang successfully capturing the villain?

Data Categories: Consider the monster_amount variable. Create variables that indicate the following about the episode:

No monsters
Only one monster
Two or three monsters
Four or more monsters

Group 4: Villian Motive

Research Question: Does the villain’s motive impact the likelihood of successful capture in Scooby Doo episodes?

Data Categories: Create a new variable that classifies the villain’s motive (motive) into the following four groups:

Theft
Competition
Conquer, abduction, extortion, or trespassing
Treasure, smuggling, or inheritance