What is "fatality" of COVID19? Still lots of confusion out there about case fatality rate (CFR) and infection fatality rate (IFR). One is what you want to know but almost never gets reported (IFR) & the other one (CFR) is reported all the time but is nearly meaningless. A thread.
CFR is the fraction of CASES that go on to die. A confirmed "case" is someone who tests positive for the pathogen. Since testing is limited and varies over time, CFR is inaccurate. If not all infections become cases, which is always the case, CFR != IFR.
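A minimal sketch of the two ratios in Python. The numbers are invented for illustration (they are not COVID-19 estimates), the function names are hypothetical, and it assumes every death is counted and occurs among confirmed cases.

```python
def cfr(deaths, confirmed_cases):
    """Case fatality rate: deaths divided by CONFIRMED cases."""
    return deaths / confirmed_cases


def ifr(deaths, infections):
    """Infection fatality rate: deaths divided by ALL infections."""
    return deaths / infections


# Hypothetical outbreak: only 1 in 10 infections is ever confirmed by a test.
infections = 10_000
confirmed_cases = 1_000
deaths = 60

cfr_value = cfr(deaths, confirmed_cases)
ifr_value = ifr(deaths, infections)
ascertainment = confirmed_cases / infections  # fraction of infections that become cases

print(f"CFR = {cfr_value:.2%}, IFR = {ifr_value:.2%}")
print(f"CFR / IFR = {cfr_value / ifr_value:.1f} (= 1 / ascertainment)")
```

With the same deaths and infections, changing only how many infections get tested moves CFR while IFR stays put.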
In fact, I'd argue that estimating and reporting CFRs should not be done. CFR is highly context-dependent (much more than IFR but see below) & is interpreted as IFR which it can differ from by 10x+. It is entirely dependent on testing capacity, case definition and more.
In contrast, IFR is precisely what we all want to know. It is the probability of dying given that we get infected. It does not depend on testing capacity or case definition. However, it does depend on individual characteristics (for COVID19: age, pre-existing conditions, etc.).
Because chance of dying given infection (IFR) depends on individual traits it differs from pop to pop as traits do (e.g. age distribution). Thus, population IFR isn't what a layperson wants. Individual IFR would require an age- and trait-specific IFR for each "type" of person.
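To make the "differs from pop to pop" point concrete, here is a rough sketch that treats population IFR as an age-weighted average of age-specific IFRs. The age bands, IFR values and age distributions are all invented for illustration, and it assumes infection risk is the same in every age band.

```python
def population_ifr(age_shares, age_ifr):
    """Weighted average of age-specific IFRs, weighted by each band's population share."""
    total_share = sum(age_shares.values())
    return sum(age_shares[band] * age_ifr[band] for band in age_shares) / total_share


# Hypothetical age-specific IFRs (NOT real estimates).
age_ifr = {"0-39": 0.0002, "40-69": 0.004, "70+": 0.05}

# Two hypothetical populations with different age structures.
younger_pop = {"0-39": 0.60, "40-69": 0.30, "70+": 0.10}
older_pop = {"0-39": 0.40, "40-69": 0.35, "70+": 0.25}

ifr_younger = population_ifr(younger_pop, age_ifr)
ifr_older = population_ifr(older_pop, age_ifr)

print(f"younger pop IFR: {ifr_younger:.3%}")
print(f"older pop IFR:   {ifr_older:.3%}")
```

Same age-specific risks, different population IFRs: only the age mix changed.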
Some people e.g. @michaelmina_lab suggest we shouldn't report IFRs for pops b/c of differences among pops and large prob of misinterpretation. It's hard to disagree with that. However, CFR is far worse b/c it has all of IFR issues plus those huge ones listed above.
IFR + fraction of pop exposed (separately for traits that affect individ IFR if exposure varies w/ traits) can be used to estimate/predict deaths w/ models/scenarios. CFR + exposure data cannot: need + prob infection becomes a case as a function of traits (e.g. age).
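A sketch of that calculation, stratified by age. Every number here is hypothetical (population sizes, fractions exposed, age-specific IFRs), and the function name is just for illustration.

```python
def expected_deaths(pop_size, frac_exposed, age_ifr):
    """Sum over age bands of: people in band x fraction infected x IFR for that band."""
    return sum(
        pop_size[band] * frac_exposed[band] * age_ifr[band]
        for band in pop_size
    )


# Hypothetical inputs (NOT real estimates).
pop_size = {"0-39": 600_000, "40-69": 300_000, "70+": 100_000}
frac_exposed = {"0-39": 0.30, "40-69": 0.20, "70+": 0.10}
age_ifr = {"0-39": 0.0002, "40-69": 0.004, "70+": 0.05}

deaths = expected_deaths(pop_size, frac_exposed, age_ifr)
print(f"expected deaths under this scenario: {deaths:.0f}")
```

Doing the same thing with CFR would need one more input per age band: the probability that an infection becomes a confirmed case.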
So why do we see lots of reports of CFRs and almost never IFRs when CFRs are useless and IFRs are far better (but not without challenges to interpret)? Because IFRs are hard to estimate! IFRs require estimate of all infections and number of infections that die.
Estimating # of all infections is hard when some infections are asymptomatic and when testing capacity is limited&variable. Best method under these circumstances is careful serosurvey - which can provide estimate of the % of pop exposed. But these are difficult to do well.
Here's a thread about serosurveys, including some of the challenges: https://twitter.com/DiseaseEcology/status/1252473766476541952
Here's how NOT to do a serosurvey unless you are trying to rig the data: https://twitter.com/DiseaseEcology/status/1251225273871134721
If a serosurvey is done well, and we have count of total COVID deaths in that pop (which, like infections, can be underestimated), we can estimate IFR. Sounds tough, right? How can we estimate IFR before serosurveys then?
Circumstances sometimes give us (almost) what we want: estimate of all infections&deaths. Closest example: cruise ship w/ most passengers tested, & deaths recorded. Nice analysis @AdamJKucharski @timwrussell gave 1st estimate of IFR: http://doi.org/10.2807/1560-7917.ES.2020.25.12.2000256
Even this dataset not perfect (infections were underestimated, but less so, # of infections&deaths were small & age distribution heavily skewed: old). Analyses in paper address age dist. by adjusting to match China age distribution and result in China IFR: 0.6% (95%CI: 0.2–1.3)
Another study uses these data + testing of travelers to estimate total infections and also produces IFR for China age distribution: 0.66% (0.39%,1.33%) http://doi.org/10.1016/S1473-3099(20)30243-7
Overlapping datasets so not independent estimates but diff team so slightly reassuring IFR is similar.
So why are CFRs reported all the time and almost never IFRs? Because CFRs are easier to estimate (but NOT as simple as deaths/cases during epidemic: https://twitter.com/AdamJKucharski/status/1243466394991239170). IFRs are super hard w/out good serosurvey, but unique datasets and clever analysis provide initial est.
Analyses above suggest IFR is ~2x higher, and data from NYC show 0.17% of this pop have died from COVID-19 in the last month, so IFR = 0.17% is the absolute minimum (if 100% of pop is infected and deaths are accurate):
https://www1.nyc.gov/site/doh/covid/covid-19-data.page
See also: https://twitter.com/taaltree/status/1251373953181814784
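The NYC lower-bound argument above is simple arithmetic: IFR = (fraction of pop that died) / (fraction of pop infected). A quick sketch, using the 0.17% figure from the thread; the 50% and 25% infected scenarios are hypothetical, not estimates.

```python
def implied_ifr(frac_pop_dead, frac_pop_infected):
    """IFR implied by a population death fraction and an assumed fraction infected."""
    if not 0 < frac_pop_infected <= 1:
        raise ValueError("fraction infected must be in (0, 1]")
    return frac_pop_dead / frac_pop_infected


nyc_death_fraction = 0.0017  # 0.17% of the population, as quoted in the thread

# 100% infected gives the absolute minimum; the other fractions are hypothetical.
for frac_infected in (1.0, 0.5, 0.25):
    ifr = implied_ifr(nyc_death_fraction, frac_infected)
    print(f"if {frac_infected:.0%} infected -> IFR = {ifr:.2%}")
```

The fewer people actually infected, the higher the implied IFR, which is why 0.17% can only be a floor.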
Where does this put us? Key is to understand that IFR is population-specific & prob of death varies enormously w/ traits like age. Definitely do not report CFR as "fatality" and preferably don't report CFR at all. Public does not understand CFR and will badly misunderstand.
If you have to report "fatality" (people want to know!), state clearly estimate + range w/ nuance: e.g. fatality/IFR for a pop w/ China age structure&traits ~0.66% and likely in range (0.39%,1.33%).
Aside: I wish @AdamJKucharski or @azraghani had reported age-specific IFRs but data likely too sparse.
However, a well-designed large serosurvey + accurate COVID death counts could provide robust estimate of IFR for THAT pop, and with enough data could estimate trait-specific (e.g. age) IFR or "fatality". Still more nuances but already very long thread.
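A bare-bones sketch of the serosurvey route to IFR. All inputs are hypothetical, and it deliberately ignores the nuances: imperfect test sensitivity/specificity, delays between infection, seroconversion and death, and undercounted deaths.

```python
def ifr_from_serosurvey(deaths, population, seroprevalence):
    """Deaths divided by estimated infections (population x fraction seropositive)."""
    estimated_infections = population * seroprevalence
    return deaths / estimated_infections


# Hypothetical population and survey result (NOT real data).
population = 1_000_000
deaths = 600
seroprev_point, seroprev_low, seroprev_high = 0.10, 0.08, 0.12

ifr_point = ifr_from_serosurvey(deaths, population, seroprev_point)
# Higher seroprevalence means more infections, so it gives the LOWER IFR bound.
ifr_low = ifr_from_serosurvey(deaths, population, seroprev_high)
ifr_high = ifr_from_serosurvey(deaths, population, seroprev_low)

print(f"IFR ~ {ifr_point:.2%} (range {ifr_low:.2%} to {ifr_high:.2%})")
```

Running the same function per age band (deaths, population and seroprevalence for that band) would give age-specific IFRs, data permitting.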
Adding to this thread. Several age-specific & gender IFRs now exist and I should have added them earlier.
Fig for China pop (redrawn from https://doi.org/10.1016/S1473-3099(20)30243-7)
Fig & Table for France (10.1126/science.abc3517)
One based on serosurvey in Switzerland & nice analysis by @andrewazman that accounts for delays & time of serosurvey.
(Note IFRs for youngest 2 age classes are zero based on no deaths in data, but priors from Bayesian analysis increase them slightly above 0.)
You can follow @DiseaseEcology.