Answer and explanation to follow in this thread. For the record, at this time there are ~1400 votes.
About 99% -- 46%
About 90% -- 11%
About 67% -- 22%
I'm a Bayesian and already know this stuff -- 21%
https://twitter.com/bburkeESPN/status/1297559780752363521
The answer is 67%.

It's not a trick question, it's just one of the counter-intuitive qualities of probabilities. Don't feel bad if you got it wrong--it's cited in many places that the vast majority of doctors get this wrong too.

The easiest and best way to solve it is with Bayes' theorem.
Here's a gentle explainer. https://arbital.com/p/bayes_frequency_diagram/?l=55z&pathId=65556

But I'll walk through the calculation here as well.

Bayes' theorem is not a 'theory' or estimate. It's a mathematical identity, no less true than saying: x = 1/(1/x) or that 2+3 = 3+2.
Let's say "B" is the event someone actually has the virus. And say "A" is the event that one has tested positive. What we want to know is the probability of B (is true) given A (is true). This is commonly written as:
P(B|A)
The symbol "|" means "given" (it isn't a math operator).
Bayes' theorem says:
P(B|A)=P(A|B)*P(B) / P(A)

P(B|A) is what we want to know. That's called the "posterior."
P(A|B) is called the likelihood. That's the chance of testing positive given you have the virus. In this case it's 99% -- the true positive rate.
P(B) is called the "prior." This is the part most people overlook. In this case it's very low. We've said only 2% of the tested population has the virus. (This is about the rate of positives for players when they reported to camps, by the way.)
The denominator "P(A)" is known as "the evidence." It's a little harder to calculate for this problem. In simple terms, P(A) is the same as:
P(A and B occurring together), plus
P(A and not B occurring together).
P(A and B) = P(A|B)*P(B). And we already know both of those things from the problem.
P(A|B) is our true positive rate (.99). And P(B) is our prior (.02). So
P(A and B) = .99 * .02 = .0198
The 2nd part of the denominator is P(A and not B). This can be written as:
P(A|not B) * P(not B).
P(A|not B) is the false positive rate (.01)
P(not B) is all the people without the virus... 1 - P(B) = .98
Together, they multiply to .0098.
The denominator is the sum: .0198 + .0098 = .0296. We now have each part of Bayes' theorem and can calculate the chance one has the virus given he has tested positive (under our stated assumptions about the prior, true-positive, and true-negative rates).
P(B|A) = P(A|B) * P(B) / P(A)
= .99 * .02 / .0296
= .67
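
For anyone who'd rather check the arithmetic in code, here is a minimal Python sketch of the same calculation. The function name and parameter names are mine, chosen for illustration; the numbers are the ones assumed in this thread (2% prior, 99% true positive rate, 1% false positive rate).

```python
def posterior(prior, true_positive_rate, false_positive_rate):
    """Return P(B|A): the chance of having the virus given a positive test."""
    p_a_and_b = true_positive_rate * prior               # P(A|B) * P(B)
    p_a_and_not_b = false_positive_rate * (1 - prior)    # P(A|not B) * P(not B)
    evidence = p_a_and_b + p_a_and_not_b                 # P(A), the denominator
    return p_a_and_b / evidence


result = posterior(prior=0.02, true_positive_rate=0.99, false_positive_rate=0.01)
print(round(result, 2))  # 0.67
```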

The overall lesson is that for tests with some false positives and a very low general positive rate, a test that comes back positive is more likely to be false than we'd intuitively think. Much more likely.
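
To see how much the base rate drives this, here's a rough sketch that holds the test fixed (99% true positive rate, 1% false positive rate, as assumed above) and varies only the prior. The priors other than 2% are made-up values for illustration, not real infection rates.

```python
def posterior(prior, true_positive_rate=0.99, false_positive_rate=0.01):
    """P(infected | positive test) from Bayes' theorem."""
    true_positives = true_positive_rate * prior
    false_positives = false_positive_rate * (1 - prior)
    return true_positives / (true_positives + false_positives)


# Hypothetical priors, chosen only to show the trend.
for prior in (0.001, 0.02, 0.10, 0.50):
    print(f"prior {prior:.1%} -> chance a positive is real: {posterior(prior):.0%}")
```

The lower the prior, the more of the positives are false ones, even though the test itself never changes.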
But keep in mind a few things. This is for someone who doesn't have other reasons to believe they're infected, like having symptoms or being in contact with those who do.

There are many types of COVID tests, and some have virtually zero false positive rates.
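
As a rough illustration of why the false positive rate matters so much, this sketch keeps the 2% prior and 99% true positive rate from above and shrinks only the false positive rate. The smaller rates are hypothetical values picked for discussion, not the specs of any actual test.

```python
def posterior(false_positive_rate, prior=0.02, true_positive_rate=0.99):
    """P(infected | positive test) for a given false positive rate."""
    true_positives = true_positive_rate * prior
    false_positives = false_positive_rate * (1 - prior)
    return true_positives / (true_positives + false_positives)


# Hypothetical false positive rates, from the thread's 1% down toward zero.
for fpr in (0.01, 0.001, 0.0001):
    print(f"false positive rate {fpr:.2%} -> chance a positive is real: {posterior(fpr):.1%}")
```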
Also "false-positive" gets used as a misnomer sometimes. Some of these tests can detect the presence of remnants of the virus, even though someone fought it off days ago or longer. From the test's perspective it is correctly detecting virus, but medically you may not be infected.
And the spate of false positives for NFL teams lately is possibly due to a problem with a lab, so their results are correlated and not quite the same thing as the low base-rate phenomenon that is so counter-intuitive.
And lastly a disclaimer. It should be clear from the get-go I said "Let's say..." So the numbers here are plausible, but are for discussion purposes only. Do not under any circumstances say 'ESPN guy says only 67% of all COVID tests are correct' or anything like that!
You can follow @bburkeESPN.