Today I'll be live tweet-reviewing my:

20th COVID-19 paper

6th paper purporting to show whether mass events do/don't spread COVID

3rd Dave et al. paper already this quarter using the same bad research design

It's starting to feel like Sisyphus running a methods seminar https://twitter.com/SDSUCHEPS/status/1302480031638147074
If you haven't been following my review series, I recently synthesized some of the empirical and normative problems with observational COVID-19 research coming out of the social sciences on the question of mass events spreading COVID-19, in this thread here: https://twitter.com/RexDouglass/status/1302503391973634048
For background on why observational COVID research is so bad right now, see our short write-up here. In brief, we don't have an actual measure of infections in the U.S., just a bunch of proxies, and so we shove bad data into bad models and declare victory. https://twitter.com/RexDouglass/status/1295391498477826053
The Dave et al. research design, which they've used 3 to 6 times already this year, is the pinnacle of this problem. They want to know if mass events (protests, conventions, rallies) spread COVID. But we don't have individual-level data on attendees and comparable people who stayed home. So...
So they resort to a diff-in-diff, looking to see if a place has more, less, or the same number of confirmed cases soon after an event than it 'should.' The argument is that the trend line for an entire location after time T can tell us whether what happened at T was safe or risky.
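To make the shape of that estimator concrete, here's a minimal sketch of a two-period diff-in-diff on county case counts. All the numbers and series are made up for illustration; this is not Dave et al.'s data or code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up daily confirmed cases per 100k, 14 days pre/post the event,
# for a treated county and a comparison county.
pre_treated  = 4.5 + 0.05 * np.arange(14) + rng.normal(0, 0.1, 14)
post_treated = 5.5 + 0.25 * np.arange(14) + rng.normal(0, 0.1, 14)
pre_control  = 4.3 + 0.05 * np.arange(14) + rng.normal(0, 0.1, 14)
post_control = 5.0 + 0.10 * np.arange(14) + rng.normal(0, 0.1, 14)

# Two-period difference-in-differences on confirmed cases:
# (treated post - treated pre) - (control post - control pre)
did = (post_treated.mean() - pre_treated.mean()) \
    - (post_control.mean() - pre_control.mean())
print(f"DiD estimate: {did:.2f} extra confirmed cases per 100k per day")

# Note what this estimates: a change in CONFIRMED cases, not infections.
# Anything that shifts testing around the event loads into `did` too.
```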
For why this research design does not answer that question, imagine running your own experiment. Go outside and cough in a stranger's face right now. If your county's confirmed case rate goes up next week, that was bad behavior; if it stays the same, it was fine; and if it goes down, it was good!
How well this experiment captures what Dave et al. are doing depends on specifics (see the sketch after this list):
What size of treatment they're considering: one face-cough, or 30% of the community moving (mobility)
The rates of existing infection (I_t) and transmission (R_t)
The counterfactual of what R_(t+1) should have been
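Here's a toy simulation of why those specifics matter: whether county-level confirmed cases can register a treatment at all depends on its size relative to I_t and R_t. Every parameter below is a made-up illustrative value, not an estimate from the paper or anywhere else.

```python
import numpy as np

rng = np.random.default_rng(0)

def county_confirmed(extra_infections, days=14, i0=200,
                     r_t=1.1, serial=5, detect=0.3):
    """Toy exponential-growth process: daily new infections grow by
    R_t per serial interval; confirmed cases are a noisy partial
    sample. All parameters are made-up illustrative values."""
    new = (i0 + extra_infections) / serial  # daily seeding rate
    confirmed = 0
    for _ in range(days):
        new *= r_t ** (1 / serial)
        confirmed += rng.binomial(int(new), detect)
    return confirmed

# One face-cough (~1 extra infection) vs. an event infecting 1% of
# a 100k-person county, each replicated 1,000 times.
baseline = [county_confirmed(0) for _ in range(1000)]
cough    = [county_confirmed(1) for _ in range(1000)]
event    = [county_confirmed(1000) for _ in range(1000)]

print("baseline mean:", np.mean(baseline))
print("one cough    :", np.mean(cough), "(buried in noise)")
print("large event  :", np.mean(event), "(detectable shift)")
```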
So any paper making claims based on this design is going to have to be super, super careful on all of those things: really nailing the epidemiological model of COVID-19 spread before diving in on the effect or non-effect of a treatment. You don't get to skip that part of science.
To set the floor for how cavalier you can be with the epidemiological-model part: this draft doesn't even know which source of data it's using for COVID-19 cases. They're using NYT county data, but they cite CDC, JHU, and Kaiser because they don't know who imports from whom.
It doesn't get better from there. What we want to know is the transmission rate at Sturgis: how many people went, how many were likely sick, how many were sick afterward, and therefore how easily they transmitted while there. Is that rate higher than, lower than, or the same as for other mobility?
Mass events during a pandemic are inherently bad because they break the measurement needed to answer that question. Thousands of anonymous attendees break contact tracing. With the protests, states literally refused to do the tracing. That alone makes it a different kind of mobility.
So in the absence of individual-level data, we're depending entirely on an accurate measure of infections in entire populations, which we also don't have. Confirmed cases are a product of both infection and testing, and a treatment is going to affect both of them (e.g., Kubinec et al.).
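A toy illustration of that confound, with made-up numbers (not Kubinec et al.'s model): if confirmed cases are roughly infections times ascertainment, then a treatment that changes testing moves confirmed cases even when infections don't budge.

```python
# Confirmed cases ~ infections * ascertainment (the share of
# infections that get tested and detected). Made-up numbers:
infections_pre,  ascertain_pre  = 1000, 0.20  # before the event
infections_post, ascertain_post = 1000, 0.30  # testing ramped up at the rally

confirmed_pre  = infections_pre  * ascertain_pre   # 200
confirmed_post = infections_post * ascertain_post  # 300

# Confirmed cases rose 50% with ZERO change in infections.
# A diff-in-diff on confirmed cases cannot separate the two channels
# unless ascertainment is modeled explicitly.
print(confirmed_pre, confirmed_post)
```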
Dave et al. are aware that the treatment, Sturgis, has large non-zero causal effects on those other parts of the epidemiological model. Testing was explicitly part of the rally. I'm sympathetic, but you don't just get to ignore this. The effect on confirmed cases is infections + tests.
My sympathy evaporates where they have another proxy for infections that's less sensitive to testing, which they explicitly reject because it doesn't vary from before to after the treatment. At this point, we're no longer in well-intentioned territory, trying to help given limited info.
Because the data are sparse, the right answer is "we don't know." The universe doesn't owe us an answer. Here's a model that tries to estimate infections for all of South Dakota: the confidence intervals all overlap, and the trend was already increasing.
https://covid19-projections.com/us-sd 
OK, so given the above, no matter what they find, it won't answer the question we want answered. In that context, let's talk about what they find. Some combination of infections + tests per capita is higher after the rally than in some counterfactual county, neighboring counties, and state.
Their synthetic control is built from other counties (excluding South Dakota and neighboring states, and counties with too high or too low urbanicity or population density). They match on mobility, cases, and deaths per capita for the previous 28 days.
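For readers who haven't seen the method, here's a minimal sketch of the synthetic-control idea on simulated data; it is not their implementation, just the structure: nonnegative donor weights summing to one, chosen to match the treated unit's pre-period outcomes.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)

# Made-up pre-treatment outcomes: 28 days x 10 donor counties.
donors = rng.normal(5.0, 1.0, size=(28, 10)).cumsum(axis=0)
# Treated county built to resemble the first three donors.
treated = donors[:, :3].mean(axis=1) + rng.normal(0, 0.1, size=28)

# Synthetic control: weights w >= 0, sum(w) = 1, minimizing the
# pre-period mismatch ||treated - donors @ w||^2.
def loss(w):
    return np.sum((treated - donors @ w) ** 2)

n = donors.shape[1]
res = minimize(loss, x0=np.full(n, 1 / n), method="SLSQP",
               bounds=[(0, 1)] * n,
               constraints={"type": "eq", "fun": lambda w: w.sum() - 1})
print("donor weights:", np.round(res.x, 2))

# Post-period gap = treated - synthetic. In their design, that gap
# in CONFIRMED cases is what gets read as the effect of the rally.
```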
Table 3 has the main result. It's confusing what they vary across models 1, 2, and 3; it's clearer what they vary across the panels. Let's just take model 3: 3.5x post-Sturgis confirmed cases in the county, 2.2x in neighboring counties, 1.9x in the state. That spatial attenuation is good for them.
They for the most part get the gradient they want on cases exported from Sturgis to other places: the right gradient over time (stronger later) and a higher effect for larger flows. It's noisy but interesting.
The interaction with state mitigation policies I don't buy. The kinds of people who went to this rally came from places with weak policies, which were already mechanically on their upswings at this point of the year. I think that's really stretching identification.
Having reviewed 20 (read: >100 now) observational COVID-19 papers this year, my main takeaway is that each individual team has strengths/weaknesses and incentives about what they do/don't want to find, which lead them to do certain parts of the analysis really well and not others.
To contrast the two groups: Kubinec et al. got modeling latent infections right and cared about the DV, but then the back end was a mess when they needed to find stars. Dave et al. do a really good job on the back end measuring mobility and completely ignore the front end on infections.
Their model puts them (and me) in an awkward position. Their first two papers, on the protests and Trump's Tulsa rally, applied this broken model to say it's safe to have mass events. I critiqued those on the grounds that absence of evidence isn't evidence of absence. Totally reckless.
Here we've got the opposite: Sturgis almost surely spread COVID-19; it would be shocking if it didn't. And so now I have to ask whether I'm OK with a bad model being wrong if the finding is probably right. I come down on no again, for a few reasons...
First, as a scientific principle, you don't say you can know something that you can't. The entitled belief that we're owed answers even when the data are bad has led to more human suffering than most mistakes. Other groups are starting to cite and emulate this model, and that needs to stop.
Second, conceding that this model is appropriate retroactively legitimizes those earlier papers claiming protests and Trump rallies didn't spread COVID. If we think this is a real measure of infection change over time, then it's literally true that those events somehow didn't spread COVID.
Finally, over-estimating isn't costless either. There's literally going to be a lawsuit someday that cites this. WaPo is writing it up right now in their culture-war series. You don't accuse people of 263,708 cases, $12.2 billion, and thousands of deaths unless you actually know it.
What I want to see going forward is (1) all of the work on mobility carved off and put in its own paper that I can actually cite. Those are hard measurements with stable sensors, and that work has been interesting and earnest throughout. Write it up, get credit for it, help others.
Next, I want the front-end and back-end people to get their shit together and join forces so we have a real, legitimate model of infections. Somebody release an accurate model, with Bayesian draws we can import, and we'll rerun all of these and watch the confidence intervals explode.
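Here's a sketch of what that rerun could look like, assuming someone publishes posterior draws of latent infections. The arrays below are simulated stand-ins, not real draws: the point is that re-estimating the effect on every draw, rather than on point estimates, propagates the uncertainty honestly.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical posterior draws of post-event latent infections for the
# treated county and its synthetic control (simulated stand-ins).
draws = 4000
treated_post   = rng.lognormal(mean=np.log(300), sigma=0.5, size=draws)
synthetic_post = rng.lognormal(mean=np.log(250), sigma=0.5, size=draws)

# Re-estimate the "effect" on every posterior draw.
effect_draws = treated_post - synthetic_post

lo, hi = np.percentile(effect_draws, [2.5, 97.5])
print(f"effect: {effect_draws.mean():.0f}  95% interval: [{lo:.0f}, {hi:.0f}]")
# With realistic uncertainty about infections, the interval can easily
# straddle zero: the honest answer is often "we don't know."
```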
If we're not going to get a real testing and tracing system in this country, we're going to have to be experts at explaining our uncertainty with the existing data we have. The legitimacy of expertise in this country depends on it, and there's clearly smart people who can do it.
If you found this review useful: this is the 20th (yay...) COVID-19 paper I've reviewed this year, and you can find the others here: https://twitter.com/RexDouglass/status/1278115752747253760