1. A short thread about symmetric death curves and the @IHME_UW model.

Throughout, I'll use a great new tool from @yuorme:

http://www.covid-projections.com/ 

This allows us to look at how the predictions of the IHME model have changed since it was released in late March.
2. When you look at the IHME predictions for US deaths (or health care demands), you see a rapid ramp-up followed by a long slow decline. For example, here was the original 3/26 mean projection for US deaths.
3. But when you look at the state-by-state projections, you see something different. The curves are symmetrical, at least once full social distancing is in place. Here are a few examples: Washington, New York, and California as predicted on 3/27.
4. (As an aside, for states not in full social distancing at the time of the prediction, the forecast is not symmetric. I presume this is because full measures are expected to be implemented midway through the local epidemic. Again some examples below: Texas, Utah, Alaska.)
5. The basic story that the IHME model is laying out is a story in which we initially have an epidemic that is taking off. Then control measures are put into place, and the epidemic is brought under control with only about 3% of the population infected.
6. But this is a mix of two different processes, an acceleration phase prior to control, and a deceleration phase once controls are in place.

Given that, why does the model predict symmetric death curves?
7. The answer is that this is a modeling assumption that the research team has made. They have chosen to fit a particular sigmoidal curve called the Gaussian Error Function (erf function, for short) to the data representing the cumulative number of deaths that have occurred.
8. The equation for this cumulative deaths function is as follows, where t is time and the other parameters are inferred from the data.
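The equation image did not survive the unroll. As a stand-in, here is a sketch of the standard erf-based sigmoid, assuming p is the final death toll, a sets how fast deaths accumulate, and b is the time of the inflection point. The letters a, b, p are my assumption for illustration; the preprint writes the parameters in Greek.

```latex
D(t) \;=\; \frac{p}{2}\Bigl[\,1 + \operatorname{erf}\bigl(a\,(t-b)\bigr)\Bigr]
     \;=\; \frac{p}{2}\left[\,1 + \frac{2}{\sqrt{\pi}}\int_{0}^{a(t-b)} e^{-\tau^{2}}\,d\tau\right]
```

Under this form D(t) rises from 0 to p, and passes p/2 exactly at t = b.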
9. The derivative of this function gives you the curve for the number of deaths per day. It is a loosely bell-shaped curve — and critically it is a symmetric function, around the parameter value b.
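A quick numerical check of that symmetry, as a minimal sketch. It assumes cumulative deaths follow D(t) = (p/2)(1 + erf(a(t - b))), so daily deaths are D'(t) = (p a / sqrt(pi)) exp(-(a(t - b))^2). The parameter values are hypothetical, chosen only for illustration, and are not fitted to any IHME data.

```python
import math

def cumulative_deaths(t, p, a, b):
    """Assumed erf-shaped cumulative death curve."""
    return 0.5 * p * (1.0 + math.erf(a * (t - b)))

def daily_deaths(t, p, a, b):
    """Analytic time derivative of cumulative_deaths: a Gaussian bump centered at b."""
    return p * a / math.sqrt(math.pi) * math.exp(-(a * (t - b)) ** 2)

# Hypothetical parameters: 1000 total deaths, inflection on day 30.
p, a, b = 1000.0, 0.1, 30.0

for offset in (1, 5, 10, 20):
    before = daily_deaths(b - offset, p, a, b)
    after = daily_deaths(b + offset, p, a, b)
    print(f"{offset:>2} days before peak: {before:8.3f}   {offset:>2} days after: {after:8.3f}")
```

The two columns match for every offset, whatever values of p, a, and b are chosen: the symmetry comes from the functional form, not from the data.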
10. In other words, the form of the curve that the researchers use to fit the data *forces* the predicted death curve to be symmetric.

This strikes me as being extremely important to understanding how the model is going to behave as we go forward.
11. We've reached the peak in many states (hopefully). We've got the data for the up-slope of the death curve. But if I understand the IHME model correctly, it is now going to be forcing the down-slope, the back side of the curve, to be symmetric with the ramp up.
12. Let's look at an example. Here are the predictions that the IHME model has made for Washington State as the epidemic has progressed, again from @yuorme's http://www.covid-projections.com/ website.

The earlier predictions forecast a later, higher peak than we actually suffered.
13. The actual trajectory tracks the forecast pretty closely, but then starts to turn around. Once that happens, the back side of the curve is constrained to mirror the front side; the epidemic is predicted to wane quickly. By May 17, the uncertainty range collapses to 0 deaths.
14. This strikes me as unrealistic. Even ignoring the real-world coupling between states (this is not included in the model), I am not persuaded that the epidemic will necessarily be entirely finished in just over a month.
15. That depends on what R0 (R_e, strictly speaking—but they're about the same still) is in the population now. If R0 is just a shade under 1, the epidemic could stretch out for far longer. But the model cannot let that happen, because of its symmetry requirement.
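To put rough numbers on this, here is a toy sketch (not the IHME model, and not a full epidemic model): assume each generation of infections is R_e times the size of the previous one, so a decline by a given factor takes ln(factor) / -ln(R_e) generations. The 5-day generation interval is a hypothetical round number used only for illustration.

```python
import math

def days_to_decline(r_e, fold_reduction, generation_days=5.0):
    """Days for new infections to shrink by `fold_reduction` under simple
    geometric decay, assuming a constant R_e below 1.
    generation_days is an assumed, illustrative generation interval."""
    if not 0.0 < r_e < 1.0:
        raise ValueError("r_e must be strictly between 0 and 1 for a decline")
    generations = math.log(fold_reduction) / -math.log(r_e)
    return generations * generation_days

for r_e in (0.5, 0.8, 0.95):
    print(f"R_e = {r_e}: about {days_to_decline(r_e, 100):.0f} days for a 100-fold decline")
```

In this toy setup the time to decline grows without bound as R_e approaches 1, which is the behavior a mirror-image down-slope cannot reproduce.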
16. In defense of the model, every model is a tool with a purpose. The primary purpose of their model is to predict peak health care need, not the endpoint of the outbreak. That said, people need to be aware of a model's purpose and be cautious when using it for anything else.
17. Wrapping up, let's look back at the death curve for the nation as a whole in post 2. It's not symmetric, even though the component curves are. This is unsurprising: the asymmetry largely arises from summing near-symmetric curves for each state w/ varying heights and timings.
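A small sketch of how that can happen. It assumes each state's daily deaths follow the symmetric Gaussian bump implied by an erf-shaped cumulative curve, and uses three made-up "states" (total deaths, rate, peak day) that are purely hypothetical. Skewness is used here as one simple measure of asymmetry: 0 for a symmetric curve, positive for a long right tail.

```python
import math

def daily_deaths(t, p, a, b):
    """Symmetric daily-death curve (derivative of the assumed erf-shaped cumulative curve)."""
    return p * a / math.sqrt(math.pi) * math.exp(-(a * (t - b)) ** 2)

def skewness(times, weights):
    """Third standardized moment of a curve, treating it as a distribution over time."""
    total = sum(weights)
    mean = sum(t * w for t, w in zip(times, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(times, weights)) / total
    third = sum((t - mean) ** 3 * w for t, w in zip(times, weights)) / total
    return third / var ** 1.5

# Hypothetical states: (total deaths p, rate a, peak day b).
# An early, sharp peak plus later, broader ones.
states = [(1000.0, 0.15, 20.0), (600.0, 0.08, 35.0), (400.0, 0.05, 55.0)]

times = [d * 0.5 for d in range(-200, 401)]  # day -100 to day 200 in half-day steps
national = [sum(daily_deaths(t, *s) for s in states) for t in times]

for s in states:
    curve = [daily_deaths(t, *s) for t in times]
    print(f"state peaking on day {s[2]:.0f}: skewness = {skewness(times, curve):+.3f}")
print(f"sum of all states: skewness = {skewness(times, national):+.3f}")
```

Each component comes out with skewness of essentially zero, while the sum is clearly right-skewed: a fast ramp-up followed by a long slow decline.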
18. The most important point is this: in practice, the death curves in US states result from two separate processes. A pre-controls expansion, and a post-controls contraction. I see no a priori reason why the contraction should occur at the same speed as the epidemic expansion.
19. Yet the IHME model forces symmetry between these phases of the epidemic, simply because of the functional form they use to fit the data. We may see the model start to fail by large margins if this symmetry is not mirrored in the real world.
20. As always, this thread represents my good faith effort to understand what is going on in this model. If I am mistaken about any of this, I welcome criticism and correction, and will update what I have here if necessary.

Thank you for reading and goodnight.
21. Postscript and caveat. The paper (https://www.medrxiv.org/content/10.1101/2020.03.27.20043752v1.full.pdf) states that cumulative deaths D(t,α,β,ρ) are assumed to follow a Gaussian Error Function:
22. But then there's a puzzling remark in the next paragraph. Does this mean that D(t,α,β,ρ) is actually log cumulative deaths, rather than cumulative deaths as stated previously?
23. If so, I'm wrong about the symmetry of the cumulative death function. The death curve would then be given by the time derivative of e^D(t,α,β,ρ), which is not a symmetric function and instead has a slightly heavy right tail.
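A minimal sketch of that alternative reading, assuming D(t) = (p/2)(1 + erf(a(t - b))) is the log of cumulative deaths, so daily deaths are D'(t) * exp(D(t)). The parameter names a, b, p and their values are hypothetical and for illustration only. Under this assumption the value k days after b exceeds the value k days before b by a factor of exp(p * erf(a k)), so the mirror symmetry around b is lost and the peak shifts later.

```python
import math

def log_cumulative(t, p, a, b):
    """Assumed erf-shaped curve, read here as LOG cumulative deaths."""
    return 0.5 * p * (1.0 + math.erf(a * (t - b)))

def daily_deaths_log_form(t, p, a, b):
    """Time derivative of exp(log_cumulative): D'(t) * exp(D(t))."""
    slope = p * a / math.sqrt(math.pi) * math.exp(-(a * (t - b)) ** 2)
    return slope * math.exp(log_cumulative(t, p, a, b))

# Hypothetical parameters: log of final toll is ln(1000), inflection on day 30.
p, a, b = math.log(1000.0), 0.1, 30.0

days = [d * 0.1 for d in range(0, 1001)]  # day 0 to day 100
peak_day = max(days, key=lambda t: daily_deaths_log_form(t, p, a, b))
print(f"inflection parameter b = {b}, peak of daily deaths near day {peak_day:.1f}")

for k in (5, 10):
    before = daily_deaths_log_form(b - k, p, a, b)
    after = daily_deaths_log_form(b + k, p, a, b)
    print(f"{k} days before b: {before:.4f}   {k} days after b: {after:.4f}")
```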
24. Much of the basic point of this thread would still hold, namely that the back side of the death curve is forced into a functional form determined by the front side, without a free parameter to control their relative speeds. But the death curve would not be quite symmetric.
25. Can anyone from @IHME_UW clarify?
26. Here's an interesting example of the perversity of the curve-fit.

The model has been consistently underestimating deaths in Spain. But it has to match the back side of the curve to the front side. To do that it has to *steepen* the downward trajectory.
27. So in response to systematically underestimating deaths, it is revising its future estimates *down*, not up, to maintain the needed curve shape.

You can see this more starkly still in the lower bound of daily deaths.
28. This is not merely a quirk of the Spain data. The same thing is happening, though not quite as dramatically, in Italy.
29. Now that you know what to look for, you can see the same thing happening in the New York projections.
30. Back to that issue of whether D is cumulative deaths or log cumulative deaths. Here's a clear explanation of why my initial interpretation, that it's (non-logged) deaths, makes the most sense. (3-post thread). https://twitter.com/dsfulf/status/1250466200062251008
You can follow @CT_Bergstrom.