Bayesian analysis involves updating beliefs based on observed data.
Thinking Like a Bayesian
This is the natural Bayesian knowledge-building process of:
acknowledging your preconceptions (prior distribution),
using data (data distribution) to update your knowledge (posterior distribution), and
repeating (posterior distribution → new prior distribution).
Thinking Like a Bayesian
Bayesian and frequentist analyses share a common goal: to learn from data about the world around us.
Both Bayesian and frequentist analyses use data to fit models, make predictions, and evaluate hypotheses.
When working with the same data, they will typically produce a similar set of conclusions.
Statisticians typically identify as either a “Bayesian” or a “frequentist.”
🚫 We are not going to “take sides.”
✅ We will see these as tools in our toolbox.
Thinking Like a Bayesian
Bayesian probability: the relative plausibility of an event.
Considers prior belief.
Thinking Like a Bayesian
Frequentist probability: the long-run relative frequency of a repeatable event.
Does not consider prior belief.
Thinking Like a Bayesian
The Bayesian framework depends upon prior information, data, and the balance between them.
The balance between the prior information and the data is determined by the relative strength of each.
When we have little data, our posterior can rely more on prior knowledge.
As we collect more data, the prior can lose its influence.
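The shrinking influence of the prior can be sketched with a conjugate Beta-Binomial update. This illustration is not taken from the slides: the Beta(6, 4) prior (mean 0.6) and the 30% data rate are invented purely to show the posterior mean drifting from the prior toward the data as the sample grows.

```python
# Sketch: how a prior's influence fades as data accumulates.
# Assumed setup (not from the slides): Beta(6, 4) prior, data at rate 0.3.

def posterior_mean(alpha, beta, successes, trials):
    """Posterior mean of a Beta(alpha, beta) prior after binomial data."""
    return (alpha + successes) / (alpha + beta + trials)

alpha, beta = 6, 4  # prior worth 10 "pseudo-observations", prior mean 0.6
for n in [0, 10, 100, 1000]:
    successes = int(0.3 * n)  # data consistently suggest a rate of 0.3
    print(n, round(posterior_mean(alpha, beta, successes, n), 3))
# With no data the posterior mean equals the prior mean (0.6);
# by n = 1000 it has moved almost all the way to the data's 0.3.
```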
Thinking Like a Bayesian
We can also use this approach to combine analysis results.
Bayes Rule
We will use an example to work through Bayesian logic.
The Collins Dictionary named “fake news” the 2017 term of the year.
Fake, misleading, and biased news has proliferated along with online news and social media platforms, which allow users to post articles with little quality control.
We want to flag articles as “real” or “fake.”
We’ll examine a sample of 150 articles which were posted on Facebook and fact checked by five BuzzFeed journalists (Shu et al. 2017).
Bayes Rule
Information about each article is stored in the fake_news dataset in the bayesrules package.
In this dataset, 26.67% (16 of 60) of fake news titles but only 2.22% (2 of 90) of real news titles use an exclamation point.
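These percentages follow directly from the counts quoted above (16 of 60 fake titles, 2 of 90 real titles); a quick sketch of the arithmetic:

```python
# Conditional exclamation-point rates from the fake_news counts above.
fake_excl, fake_total = 16, 60
real_excl, real_total = 2, 90

p_excl_given_fake = fake_excl / fake_total   # P(exclamation | fake)
p_excl_given_real = real_excl / real_total   # P(exclamation | real)

print(round(p_excl_given_fake, 4))  # 0.2667
print(round(p_excl_given_real, 4))  # 0.0222
```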
Thinking Like a Bayesian
We now have two pieces of contradictory information.
Our prior information suggested that incoming articles are most likely real.
However, the exclamation point data is more consistent with fake news.
Thinking like Bayesians, we know that balancing both pieces of information is important in developing a posterior understanding of whether the article is fake.
Building a Bayesian Model
Our fake news analysis studies two variables:
an article’s fake vs real status and
its use of exclamation points.
We can represent the randomness in these variables using probability models.
We will now build:
a prior probability model for our prior understanding of whether the most recent article is fake;
a model for interpreting the exclamation point data; and, eventually,
a posterior probability model which summarizes the posterior plausibility that the article is fake.
Building a Bayesian Model
Let’s now formalize our prior understanding of whether the new article is fake.
Based on our fake_news data, we saw that 40% of articles are fake and 60% are real.
Before reading the new article, there’s a 0.4 prior probability that it’s fake and a 0.6 prior probability it’s not.
P\left[B\right] = 0.40 \text{ and } P\left[B^c\right] = 0.60
Remember that a valid probability model must:
account for all possible events (every article is either fake or real);
assign a prior probability to each event; and
have probabilities that sum to one.
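The three requirements above can be checked mechanically. A minimal sketch (the dictionary representation and `is_valid_prior` helper are illustrative, not from the slides):

```python
# Sketch: verifying a prior probability model is valid.
def is_valid_prior(model):
    """Check nonnegative probabilities that sum to one."""
    nonneg = all(0 <= p <= 1 for p in model.values())
    sums_to_one = abs(sum(model.values()) - 1.0) < 1e-12
    return nonneg and sums_to_one

# Every article is either fake or real, so these two events are exhaustive.
prior = {"fake": 0.4, "real": 0.6}
print(is_valid_prior(prior))  # True
```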
Building a Bayesian Model
Now we will summarize the insights from the data we collected on the new article.
We want to formalize our observation that the exclamation point data is more compatible with fake news than with real news.
title_has_excl    fake          real
FALSE             73.3% (44)    97.8% (88)
TRUE              26.7% (16)    2.2% (2)
We have the following conditional probabilities:
If an article is fake (B), then there’s a roughly 26.67% chance it uses exclamation points in the title.
If an article is real (B^c), then there’s only a roughly 2.22% chance it uses exclamation points.
Looking at the probabilities, we can see that 26.67% of fake articles vs. 2.22% of real articles use exclamation points.
Exclamation point usage is much more likely among fake news than real news.
We have evidence that the article is fake.
Building a Bayesian Model
Note that we know that the incoming article used exclamation points (A), but we do not actually know if the article is fake (B or B^c).
In this case, we compared P[A|B] and P[A|B^c] to ascertain the relative likelihood of observing A under different scenarios.
L\left[B|A\right] = P\left[A|B\right] \text{ and } L\left[B^c|A\right] = P\left[A|B^c\right]
Event               B         B^c       Total
Prior Probability   0.4       0.6       1.0
Likelihood          0.2667    0.0222    0.2889
It is important for us to note that the likelihood function is not a probability function.
Rather, it is a framework for comparing the relative compatibility of our exclamation point data with B and B^c.
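Comparing the two likelihoods directly shows how strongly the data favor the fake-news scenario (a numeric sketch using the counts from the slides):

```python
# Comparing likelihoods of the exclamation-point data A under each scenario.
like_fake = 16 / 60   # L(B | A) = P(A | B)
like_real = 2 / 90    # L(B^c | A) = P(A | B^c)

ratio = like_fake / like_real
print(round(ratio, 1))
# Exclamation points in the title are 12 times more likely
# under fake news than under real news.
```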
Building a Bayesian Model
Event               B (fake)    B^c (real)    Total
Prior Probability   0.4         0.6           1.0
Likelihood          0.2667      0.0222        0.2889
The prior evidence suggested the article is most likely real,
P[B] = 0.4 < P[B^c] = 0.6
The data, however, is more consistent with the article being fake,
L[B|A] = 0.2667 > L[B^c|A] = 0.0222
Building a Bayesian Model
We can summarize our probabilities in a table, but some calculations are required.
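Those calculations are an application of Bayes' Rule: weight each likelihood by its prior, normalize by the total probability of the data, and read off the posterior. A sketch using exact fractions rather than the rounded table values:

```python
# Sketch: Bayes' Rule for the posterior probability the article is fake,
# given that its title uses exclamation points (event A).
prior_fake, prior_real = 0.4, 0.6
like_fake, like_real = 16 / 60, 2 / 90    # P(A | B), P(A | B^c)

# Total probability of observing exclamation points, P(A),
# by the law of total probability.
marginal = prior_fake * like_fake + prior_real * like_real

posterior_fake = prior_fake * like_fake / marginal
print(round(posterior_fake, 3))  # 0.889
# The posterior balances the prior (which favored "real")
# against the data (which favored "fake").
```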