There are a lot of things I could say about this article, but in the interest of fairness, the authors at least aren’t wrong about the need to use the best, most valid inputs we can find for our models.

Epidemiologists & modelers know this well. https://twitter.com/statnews/status/1386077576330944521
The most important problem, IMO, is that the *only* way to get real-world data about an infectious disease is to let real-world infections happen.

That means letting real-world hospitalizations, real-world deaths, and real-world long covid happen too!
This is WHY we use models—they can help us make intelligent predictions about what the data might have been able to tell us, WITHOUT letting those infections, deaths, and illnesses happen.
The second problem is that all of our fantastic methods that help us decide on action A vs B for chronic disease start to fall apart when we’re dealing with infectious disease.

Infections violate a core condition for our causal methods to work: independence between individuals.
Unfortunately, a causal model requires causal inputs, and these aren’t easy to get.

Sometimes, they are actually *impossible* for us to estimate from any data, even a randomized trial. https://journals.sagepub.com/doi/full/10.1177/0272989X19894940
How to make good decisions about hard questions using the best data we have, but without waiting for more people to get sick and die so that we could have perfect data, is my scientific area of expertise.

Several papers in this 🧵 are my best attempts to get better answers sooner.
I have been studying how to do this for more than a decade, and there are still many things I & others don’t know yet.

The one thing I definitely know: “real-world” data are never simple.
You can follow @EpiEllie.