Thread by @raj_mehta

1/ A tweetorial on false positives & transposing the conditional!

Or why interpreting type 1 error can be so confusing.

This is a common topic that overlaps in medical testing, data reporting, and statistical hypothesis tests (NHST).

2/ Let's start with some basic terms!

In binary classification, say for medical test results, there are 4 basic categories.

True positive
True negative
False positive
False negative
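
As a rough sketch, the four categories can be written as a lookup on two yes/no facts: is the condition actually present, and did the test come back positive? The function name and the patient list here are made up for illustration.

```python
from collections import Counter

def confusion_category(condition_present: bool, test_positive: bool) -> str:
    """Name the category for one (actual condition, test result) pair."""
    if test_positive:
        return "True positive" if condition_present else "False positive"
    return "False negative" if condition_present else "True negative"

# Hypothetical patients: (condition_present, test_positive)
patients = [(True, True), (False, False), (False, True), (True, False), (False, False)]

# Tally how many patients land in each category.
counts = Counter(confusion_category(c, t) for c, t in patients)
print(dict(counts))
```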

3/ We are interested in the False positive.

A false positive medical test result is when a patient DOES NOT have a condition, but the test (predicted condition) is Positive. See the table below.

4/ A False Positive in statistical hypothesis testing is when the null hypothesis (H0) is true, but the test has a result compatible with the alternative hypothesis (H1).

In the table below, this is like detecting a signal, when no signal is present.

5/ In fairy tales, a False Positive is also known as a False Alarm. It's the story of the boy who cried wolf when no wolves were present.

6/ It's important to note that a False Positive is a *category*.

By itself, a False Positive *is not* a fraction, or a probability, or a percentage, or a rate. It is only a count.

Below is an example of a study, where 180 test results were counted as False Positives.

7/ Unsurprisingly, False Positives (FP) & False Negatives (FN) are considered Errors.

In the 1920s, the statisticians Neyman and Pearson identified these possible errors of NHST, and creatively decided to label them as "Type 1 Errors" (FP) and "Type 2 Errors" (FN).

8/ Mathematically, the "Type 1 error rate" is the ratio of False Positives to the total number of actual negatives (cases where the condition is absent, or H0 is true):
FP / (FP + TN)

Even though "Type 1 errors" (a count) is not the same as "Type 1 error rate" (a ratio), we often use the terms interchangeably.

It's so confusing!
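
To make the count vs. rate distinction concrete, here is a minimal sketch. The 180 false positives is the count quoted earlier in the thread; the number of true negatives is NOT from the study and is invented purely so the ratio can be computed.

```python
def false_positive_rate(fp: int, tn: int) -> float:
    """Type 1 error rate from counts: FP / (FP + TN)."""
    actual_negatives = fp + tn
    if actual_negatives == 0:
        raise ValueError("need at least one condition-absent case")
    return fp / actual_negatives

fp = 180    # count of false positives (the figure quoted in the thread)
tn = 1820   # hypothetical count of true negatives, for illustration only

rate = false_positive_rate(fp, tn)
print(f"count = {fp}, rate = {rate:.2%}")
```

The same count of 180 would give a very different rate with a different number of true negatives, which is why the count alone says nothing about the rate.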

9/ Type 1 error rate can also be expressed as a probability (α):

Prob(FP | H0 true): If the null hypothesis is true, it's the probability of falsely rejecting it.

Prob(FP | condition absent): If the condition/disease is absent, it's the probability of falsely diagnosing it.
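
One way to see α as a conditional probability is to simulate many studies in a world where H0 is true and count how often a test rejects it. This is only a sketch under assumed conditions: a two-sided z-test, normally distributed data with known standard deviation 1, and sample size, trial count, and seed picked arbitrarily.

```python
import random
from statistics import NormalDist

def simulate_type1_rate(alpha=0.05, n=30, trials=20_000, seed=0):
    """Fraction of two-sided z-tests that reject H0 when H0 is true."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    rejections = 0
    for _ in range(trials):
        # Every sample is generated under H0: true mean 0, known sd 1.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = (sum(sample) / n) * n ** 0.5
        if abs(z) > z_crit:
            rejections += 1  # a false positive, since H0 is true by construction
    return rejections / trials

rate = simulate_type1_rate()
print(f"simulated Prob(reject | H0 true) = {rate:.3f}")
```

The simulated fraction should land near the chosen α. Note that the conditioning is on H0 being true: every simulated world has no real effect.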

10/ OK! Following so far? Hopefully this has set the stage for the fun part of the tweetorial.

Because now we dive into the rabbit hole!

11/ Scenario 1: Let's say we have a positive medical test result, but are unsure if a disease is truly present. We want to know if the test result is wrong, i.e. a False Positive or Type 1 Error.

How do we determine the chances that this is a false positive? What do we calculate?

12/ The correct answer is "Other".

We want to know: Given a positive result, what is the probability that the condition is absent? Prob(Condition absent | Pos Test).

But the Type 1 Error Rate (α) is Prob(FP | Condition absent), that is, Prob(Pos Test | Condition absent): the inverse (transposed conditional) of our question!
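
The two conditionals are linked by Bayes' rule. Written out, with "pos" for a positive test and "present"/"absent" for the true condition (shorthand introduced here, not in the thread):

```latex
\[
P(\text{absent} \mid \text{pos})
  = \frac{P(\text{pos} \mid \text{absent})\, P(\text{absent})}
         {P(\text{pos} \mid \text{absent})\, P(\text{absent})
          + P(\text{pos} \mid \text{present})\, P(\text{present})}
\]
```

Here P(pos | absent) is α, P(pos | present) is the test's sensitivity, and P(present) = 1 - P(absent) is the prevalence. So α is only one of three ingredients needed to answer the question.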

13/ So in our scenario, the "Type 1 error rate" (α) does not help us determine the probability of having a false positive.

Or, to be doubly confusing: in our scenario the "Type 1 Error Rate" (α) does not help us determine the probability of having a "Type 1 Error".
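
A small numerical sketch of why α alone is not enough: hold α and sensitivity fixed and vary only the prevalence. All numbers below are hypothetical, picked to show the effect.

```python
def prob_absent_given_positive(alpha: float, sensitivity: float, prevalence: float) -> float:
    """Bayes' rule: Prob(condition absent | positive test)."""
    false_pos = alpha * (1 - prevalence)   # Prob(pos test AND condition absent)
    true_pos = sensitivity * prevalence    # Prob(pos test AND condition present)
    return false_pos / (false_pos + true_pos)

alpha, sensitivity = 0.05, 0.90            # hypothetical test characteristics
for prevalence in (0.01, 0.10, 0.50):      # hypothetical base rates
    p = prob_absent_given_positive(alpha, sensitivity, prevalence)
    print(f"prevalence {prevalence:.0%}: Prob(absent | pos) = {p:.1%}")
```

With α fixed at 5%, the chance that a positive result is a false positive runs from roughly 85% at 1% prevalence down to about 5% at 50% prevalence.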

14/ Now, to be triply confusing: "Type 1 Error" is often used interchangeably with "Type 1 Error Rate". As such, the "probability of Type 1 error rate" (α), is often defined just as "the probability of type 1 error".

Got that?

15/ Yet the (α) defined as "probability of a Type 1 error" is still the conditional inverse of our attempt in the previously stated scenario to calculate the "probability of a false positive" or "probability of a Type 1 error".

Are you confused? I'm feeling pretty confused.

16/ What is the solution out of this semantic and transposed conditional quagmire?

The best advice I've heard is this:

1) use conditional statements for probability
2) limit usage of the phrase "type 1 error" to study design
3) avoid pos/neg labels for non-binary test results

17/ Scenario 1: We have a positive test result, but are unsure if the disease/condition is truly present.

Confusing Q: Is this a false positive or type 1 error?
Good Q: Given the positive test result, what is the probability that the condition is absent?

18/ Scenario 2: A study has made a new discovery, and we want to determine the chances it is wrong.

Wrong Q: What is the chance of type 1 error rate (α)?
Confusing Q: What is the chance of a false positive?
Good Q: Given the study results, what is the probability that the discovery is false?
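
The "Good Q" can be sketched with the same Bayes' rule arithmetic, under a big simplifying assumption: "the study results" are reduced to "the test was significant at level α". The prior probability that the effect is real and the study's power are hypothetical inputs you would have to argue for; none of these numbers come from the thread.

```python
def prob_discovery_false(alpha: float, power: float, prior_true: float) -> float:
    """Prob(H0 true | significant result), by Bayes' rule."""
    false_discoveries = alpha * (1 - prior_true)  # significant, but no real effect
    true_discoveries = power * prior_true         # significant, and effect is real
    return false_discoveries / (false_discoveries + true_discoveries)

# Hypothetical inputs: alpha 5%, power 80%, 10% prior chance the effect is real.
p = prob_discovery_false(alpha=0.05, power=0.80, prior_true=0.10)
print(f"Prob(discovery is false | significant result) = {p:.0%}")
```

With these made-up inputs the answer is 36%, far from the 5% that a careless reading of α would suggest.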

19/ When discussing False Positives, always keep in mind

i) am I discussing the "category" and counts?
ii) am I discussing rates & probability?
iii) for probability questions, what is the conditional order?

20/ End

I hope this thread has been a helpful introduction to the topic. Feel free to post corrections or comments.

@ADAlthousePhD @venkmurthy @JeremySussman @THilalMD @ihtanboga @mikejohansenmd @dailyzad @boback
