There has been a lot of talk about the IHME Covid-19 projection model. @EpiEllie & I have a chat about it in tomorrow's @casualinfer episode; here is a quick description of what is going on, with a focus on the *uncertainty*

🔗 https://covid19.healthdata.org/united-states-of-america

1/14
When I look at models, I usually start with two things:

📈 What method is being used?
📥 What data is it based on?

Let's start with the methods!

2/14
📈 The IHME model is estimating the log of the cumulative death rate for a given state at a given time
🌊 Using curve fitting¹
📏 parametrized with info about the state's social distancing

--
¹ in particular it is a non-linear mixed effects model ℹ️ https://ihmeuw-msca.github.io/CurveFit/methods/
3/14
📈 Since the IHME model is trying to estimate a *curve* there are ✌️ two important pieces:

1️⃣ When will deaths "peak"
2️⃣ How many deaths will there be at the "peak"

4/14
📥 To estimate these, the IHME model has two sources of info:

⏱ the current death rate over time for the state
📏 the social distancing measures being implemented

5/14
This information is combined with some 🌍 global info as well

👶 In the short run, the model is impacted more by the state's data
👴 In the long run, it uses info from locations that seem to have already reached a peak: Wuhan, 5 in Italy, and 2 in Spain

6/14
OKAY now that we know what the IHME model is doing, let's get to the good stuff - where is the uncertainty?

1️⃣ There is uncertainty about whether the model itself will accurately predict what will happen (it's based on a Gaussian error function; is that the right choice?)

7/14
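To make the "curve" idea concrete, here is a minimal sketch of a Gaussian-error-function curve, assuming the form p/2 · (1 + erf(α(t − β))) for the cumulative death rate. The parameter names `alpha`, `beta`, `p` and all the numbers are made up for illustration; this is not the IHME code, and it fits nothing. It only shows how the shape pins down when the "peak" happens and how high it is.

```python
import math

def cumulative_death_rate(t, alpha, beta, p):
    """ERF-shaped cumulative curve: p/2 * (1 + erf(alpha * (t - beta)))."""
    return 0.5 * p * (1.0 + math.erf(alpha * (t - beta)))

def daily_death_rate(t, alpha, beta, p):
    """Derivative of the cumulative curve: a Gaussian bump centered at beta."""
    return p * alpha / math.sqrt(math.pi) * math.exp(-(alpha * (t - beta)) ** 2)

# Hypothetical parameters (assumptions for illustration, not fitted values):
# alpha controls how fast the curve rises, beta is the day of the peak,
# p is the final cumulative death rate.
alpha, beta, p = 0.1, 30.0, 100.0

days = range(0, 61)
peak_day = max(days, key=lambda t: daily_death_rate(t, alpha, beta, p))
peak_height = daily_death_rate(peak_day, alpha, beta, p)

print("peak day:", peak_day)
print("daily rate at peak:", round(peak_height, 3))
print("cumulative rate at peak:", cumulative_death_rate(peak_day, alpha, beta, p))
```

Under this shape the daily curve is symmetric around the peak, and exactly half of the final total has accumulated by the peak day. That is a strong assumption, which is why model choice is worth listing as its own source of uncertainty.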
2️⃣ There is uncertainty in the distributional assumptions of the model
3️⃣ Even if the model is correctly specified, there is uncertainty in the parameter estimation (this is a mixed effects model, so there is uncertainty associated with both the fixed and the random effects)

8/14
5️⃣ There may be random uncertainty in the reported state-by-state death data
6️⃣ There is uncertainty in the reported information coming from cities that seem to have already peaked

10/14
So let's recap on the uncertainty in the IHME model:

1️⃣ model choice
2️⃣ model parameters
3️⃣ model estimation
4️⃣ data from the states (systematic)
5️⃣ data from the states (random)
6️⃣ data from the "peaked" locations

11/14
In the original model (before last week), the error bands you saw only accounted for 3️⃣. Since then, the model has been updated so that the bands also account for out-of-sample uncertainty, which I believe covers 5️⃣

12/14
The shaded red region in the model is the *uncertainty* the model accounts for, which covers just two of the six sources:

❌1️⃣ model choice
❌2️⃣ model parameters
✅3️⃣ model estimation
❌4️⃣ data from the states (systematic)
✅5️⃣ data from the states (random)
❌6️⃣ data from the "peaked" locations

13/14
This is not unusual or bad! It is just good to keep in mind the uncertainty that these projections carry with them. If all of the uncertainty we've talked about today were quantified, it's possible we'd basically have no answers to go off of 🤷‍♀️

https://twitter.com/hspter/status/1246955939946803202?s=20

14/14
Think I missed something important? Please let me know! 🙏
You can follow @LucyStats.