I've seen a few papers describing the characteristics of people who tested positive for COVID-19, and this is sometimes interpreted as describing the *probability of infection* for people with certain characteristics. Let's talk about why that's likely not true 👇🧵
1/22
2/22
Let's do some 💭 thought experiments. For these, my goal is to estimate the probability of being infected with 🦠 COVID-19 given you have 🧩 Disease X
For example, 🧩 Disease X could be:
♥️ heart disease
🩸 hypertension
➕ it could also be any subgroup (for example age, etc.)
3/22
In these 💭 thought experiments, we don't actually have perfect information about who is infected with 🦠 COVID-19; we just know, among those who are 🧪 *tested*, who has been infected with 🦠 COVID-19. This is really the crux of the matter.
4/22
For these 💭 thought experiments, assume that the current tests are *perfect* (that is, there are 0 false positives and 0 false negatives)
☝️ note that this is likely not the case: with the current testing framework, false (+) are unlikely but false (-) may be occurring
5/22
We want the probability of being infected with COVID-19 given you have disease X
P(🦠|🧩)
To get this, we need P(🧩|🦠), because based on Bayes' Theorem we know:
P(🦠|🧩) = P(🧩|🦠)P(🦠) / P(🧩)
6/22
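To make the formula concrete, here is a minimal Python sketch of Bayes' Theorem as written above. The function name `bayes` and the input values are illustrative assumptions (made-up numbers, not data).

```python
def bayes(p_x_given_covid: float, p_covid: float, p_x: float) -> float:
    """Return P(COVID | X) = P(X | COVID) * P(COVID) / P(X)."""
    if not 0 < p_x <= 1:
        raise ValueError("p_x must be in (0, 1]")
    return p_x_given_covid * p_covid / p_x


# Made-up inputs: 20% of infected people have disease X,
# 50% of everyone is infected, 20% of everyone has disease X.
p_covid_given_x = bayes(p_x_given_covid=0.20, p_covid=0.50, p_x=0.20)
print(f"P(COVID | X) = {p_covid_given_x:.0%}")
```

The answer is only as good as the P(X | COVID) that goes in.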
BUT, instead of P(🧩|🦠), we actually have P(🧩|🦠,🧪): the probability of having disease X given you have COVID-19 AND you were tested. So the crux of these thought experiments will be trying to get an accurate estimate of P(🧩|🦠) so that we can get back to P(🦠|🧩)
7/22
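One way to see why the two quantities can differ is to apply Bayes' Theorem within the infected group. The letters here are shorthand introduced for this sketch, not the thread's notation: C = has COVID-19, X = has disease X, T = was tested.

```latex
P(X \mid C, T) = \frac{P(T \mid X, C)\, P(X \mid C)}{P(T \mid C)}
```

The observed P(X | C, T) equals the P(X | C) we want only when P(T | X, C) = P(T | C), that is, when having disease X doesn't change an infected person's chance of being tested.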
¹ all numbers are made up
8/22
Why is 💭 experiment 1️⃣ a best case scenario?
It looks like:
🧪 50% have COVID-19 among those tested
🧪 Of those who tested positive, the prevalence of disease X is 20%
P(🦠|🧩) = 50%
✅ Reality (no relationship between disease X and COVID-19) matches what we see
9/22
10/22
Why is 💭 experiment 2️⃣ bad?
It looks like:
🧪 50% have COVID-19 among those tested
🧪 Of those who tested positive for COVID-19, the prevalence of disease X is 33% 😱
❌ If we plug in what we see, P(🧩|🦠,🧪), for P(🧩|🦠), it looks like P(🦠|🧩) is 82.5%; reality is 50%
11/22
12/22
Why is 💭 experiment 3️⃣ bad?
It looks like:
🧪 50% have COVID-19 among those tested
🧪 Of those who tested positive for COVID-19, the prevalence of disease X is 11%
❌ If we plug in what we see, P(🧩|🦠,🧪), for P(🧩|🦠), it looks like P(🦠|🧩) is 27.5%; reality is 50%
13/22
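The first three experiments can be reproduced with a short Python sketch. It assumes a 20% prevalence of disease X, a 50% prevalence of COVID-19, no real association between the two, and a hypothetical `testing_ratio` parameter (how many times more likely people with disease X are to be tested). The function name is also just for illustration.

```python
def observed_and_naive(testing_ratio, p_x=0.2, p_covid=0.5):
    """Return (P(X | COVID, tested), naive P(COVID | X)).

    testing_ratio: how many times more likely people with disease X are to
    be tested than people without it (hypothetical parameter). Disease X and
    COVID-19 are assumed independent, so the true P(COVID | X) is p_covid.
    """
    # Relative numbers of tested, infected people with and without disease X.
    with_x = testing_ratio * p_x * p_covid
    without_x = (1 - p_x) * p_covid
    observed = with_x / (with_x + without_x)
    # Plug the observed value into Bayes' Theorem in place of P(X | COVID).
    naive = observed * p_covid / p_x
    return observed, naive


for label, ratio in [("1", 1.0), ("2", 2.0), ("3", 0.5)]:
    observed, naive = observed_and_naive(ratio)
    print(f"experiment {label}: observed {observed:.1%}, naive {naive:.1%}, truth 50.0%")
```

Unrounded, the naive estimates come out near 83% and 28%. The 82.5% and 27.5% quoted above follow from rounding the observed prevalence to 33% and 11% before plugging it in.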
14/22
Why is 💭 experiment 4️⃣ bad?
It looks like:
🦠🧪 66% have COVID-19 among those tested
🧩🧪 Of those who tested positive for COVID-19, the prevalence of disease X is 66%
❌ We're getting both the prevalence of COVID-19 *and* its association with Disease X wrong
15/22
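As a rough check on the two 66% figures, here is a Python sketch with hypothetical tested counts. The counts are an assumption, chosen to be consistent with the numbers used in the corrections later in the thread (people with disease X tested 5x as often, 80% of them infected versus 50% of everyone else).

```python
# Hypothetical counts among people who were tested (made-up numbers).
tested = {
    ("x", "positive"): 4,
    ("x", "negative"): 1,
    ("no_x", "positive"): 2,
    ("no_x", "negative"): 2,
}

total = sum(tested.values())
positives = tested[("x", "positive")] + tested[("no_x", "positive")]

covid_among_tested = positives / total                    # 6 of 9 tested
x_among_positives = tested[("x", "positive")] / positives  # 4 of 6 positives

print(f"COVID-19 among tested: {covid_among_tested:.1%}")
print(f"Disease X among tested positives: {x_among_positives:.1%}")
```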
OKAY, scenarios finished. Hopefully this highlights why we can't take the prevalence of characteristics in the *tested positive* population as the prevalence of characteristics in the overall COVID-19 population. Now, here are tips for how we can correct the numbers 👇
16/22
Scenario 2️⃣: Oversampling by 2x
👉 take those with disease X that tested positive for COVID-19 and downweight them by a factor of 2.
✅ the adjusted prevalence of Disease X among those that tested positive for COVID-19 is (0.5 / 2.5) = 0.2 (20%)
P(🦠|🧩) = 50%
17/22
Scenario 3️⃣: Undersampling by 1/2
👉 take those with disease X that tested positive for COVID-19 and upweight them by a factor of 2.
✅ the adjusted prevalence of Disease X among those that tested positive for COVID-19 is (2 / 10) = 0.2 (20%)
P(🦠|🧩) = 50%
18/22
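The reweighting in scenarios 2 and 3 can be sketched in a few lines of Python. The function name and the relative counts (1 vs 2 positives, 1 vs 8 positives) are assumptions inferred from the 0.5 / 2.5 and 2 / 10 arithmetic above, and the sampling ratio is assumed to be known.

```python
def adjusted_prevalence(n_x_pos, n_no_x_pos, sampling_ratio):
    """Prevalence of disease X among positives after undoing the sampling ratio.

    sampling_ratio > 1 means people with disease X were oversampled,
    sampling_ratio < 1 means they were undersampled.
    """
    weighted_x = n_x_pos / sampling_ratio  # downweight or upweight the X group
    return weighted_x / (weighted_x + n_no_x_pos)


# Scenario 2: 1 in 3 positives has disease X, oversampled 2x -> 0.5 / 2.5
scenario_2 = adjusted_prevalence(n_x_pos=1, n_no_x_pos=2, sampling_ratio=2)
# Scenario 3: 1 in 9 positives has disease X, undersampled by 1/2 -> 2 / 10
scenario_3 = adjusted_prevalence(n_x_pos=1, n_no_x_pos=8, sampling_ratio=0.5)

print(f"scenario 2: {scenario_2:.0%}, scenario 3: {scenario_3:.0%}")
```

Both come back to the 20% that holds in the made-up population, which is what puts P(COVID | X) back at 50%.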
Scenario 4️⃣: Two problems
For the prevalence of COVID-19, correct by weighting by the probability of being tested in each subgroup (🧩 = disease X, ❌🧩 = No disease X)
P(🦠) = P(🦠 | 🧩) P(🧩) + P(🦠 | ❌🧩) P(❌🧩)
✅ P(🦠) = ⅘ * 0.2 + ½ * 0.8 = 56%
19/22
Scenario 4️⃣: Two problems
Said another way, for calculating the overall prevalence of COVID-19, this is like downweighting the oversampled Disease X people (divide by 5).
✅ (⅘ + 2) / (⅘ + 2 + ⅕ + 2) = 0.56
20/22
Scenario 4️⃣: Two problems
For calculating the prevalence of disease X among COVID-19 patients
✅ P(🧩 | 🦠) = P(🦠 | 🧩) P(🧩) / P(🦠) = ⅘ * 0.2 / 0.56 ≈ 0.286
Again, downweight the oversampled Disease X population (divide by 5).
✅ ⅘ / (⅘ + 2) ≈ 0.286
P(🦠|🧩) = 80%
21/22
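The scenario 4 corrections can be checked end to end with exact fractions. This Python sketch assumes hypothetical tested counts (4 positive and 1 negative with disease X, 2 positive and 2 negative without) and a known 5x oversampling of people with disease X. All variable names are illustrative.

```python
from fractions import Fraction

# Downweight the oversampled disease X group by a factor of 5 (assumed known).
weight_x = Fraction(1, 5)
x_pos, x_neg = 4 * weight_x, 1 * weight_x        # 4/5 and 1/5
no_x_pos, no_x_neg = Fraction(2), Fraction(2)

everyone = x_pos + x_neg + no_x_pos + no_x_neg

p_covid = (x_pos + no_x_pos) / everyone          # overall prevalence of COVID-19
p_x = (x_pos + x_neg) / everyone                 # overall prevalence of disease X
p_x_given_covid = x_pos / (x_pos + no_x_pos)     # disease X among the infected

# Bayes' Theorem gets us back to the quantity we wanted in the first place.
p_covid_given_x = p_x_given_covid * p_covid / p_x

print(f"P(COVID) = {float(p_covid):.2f}")
print(f"P(X | COVID) = {float(p_x_given_covid):.3f}")
print(f"P(COVID | X) = {float(p_covid_given_x):.2f}")
```

Using `Fraction` avoids rounding along the way, so the last step lands exactly on ⅘.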
Hopefully this is somewhat helpful when reading about characteristics of those who are currently testing positive for COVID-19. As always, please let me know if there is something I've missed! 🙏
22/22