This preprint is getting a lot of likes and retweets. But a correlation of 0.994 when one of the variables is an integer in the range (0,29) seems... optimistic. /1 https://twitter.com/BrennanSpiegel/status/1265119535901732865
The data don't seem to be readily available, so I digitised them by eye from Figure 2A. This will certainly be imperfect, but it's quite easy to calibrate the red [virus] dots, since the grey [admissions] dots must be integers. /2
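A minimal sketch of how that calibration check could work, assuming the dots were recorded as pixel positions and each Y-axis is linear. The function names and pixel coordinates are made up for illustration; this is not necessarily how the digitising was actually done.

```python
def make_axis_map(pixel_a, value_a, pixel_b, value_b):
    """Build a linear pixel-to-value map from two reference ticks on one axis."""
    scale = (value_b - value_a) / (pixel_b - pixel_a)

    def to_value(pixel):
        return value_a + (pixel - pixel_a) * scale

    return to_value


def snap_to_integers(values):
    """Round readings that must be integers; also report the worst rounding gap."""
    snapped = [round(v) for v in values]
    worst = max(abs(v - s) for v, s in zip(values, snapped))
    return snapped, worst


# Hypothetical admissions axis: pixel row 400 is 0 admissions, pixel row 100 is 30.
admissions_axis = make_axis_map(400, 0, 100, 30)

# Hypothetical pixel rows read off for three grey dots.
raw = [admissions_axis(p) for p in (359, 331, 278)]
admissions, worst_gap = snap_to_integers(raw)
```

If `worst_gap` stays well below 0.5, the by-eye readings are landing close to whole numbers, which suggests the same pixel precision carries over to the red dots on the other axis.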
(There is apparently software to do this, but I don't know if that can handle the values of interest being on two Y-axes.) /3
I don't do Git, but I can make the CSV file available if anyone wants it. Otherwise the numbers are all in the image on the previous tweet, and you don't even need the dates. /4
Then I calculated the Pearson correlation between virus concentration and admissions 0, 1, 2, etc. days later. /5
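For anyone who wants to repeat this, here is a rough sketch of the lagged-correlation step in Python with NumPy. The function name and the toy numbers are illustrative assumptions, not the digitised data, and it assumes one reading per day with no gaps.

```python
import numpy as np


def lagged_pearson(virus, admissions, max_lag):
    """Pearson r between virus on day t and admissions on day t + lag.

    Both inputs are assumed to be daily series of equal length with no gaps.
    """
    virus = np.asarray(virus, dtype=float)
    admissions = np.asarray(admissions, dtype=float)
    results = {}
    for lag in range(max_lag + 1):
        x = virus[: len(virus) - lag]   # drop the last `lag` virus readings
        y = admissions[lag:]            # drop the first `lag` admission counts
        results[lag] = float(np.corrcoef(x, y)[0, 1])
    return results


# Toy data (made up): admissions simply repeat the virus series two days later.
toy_virus = [1, 3, 2, 5, 4, 6, 8, 7, 9, 10]
toy_admissions = [0, 0] + toy_virus[:-2]

correlations = lagged_pearson(toy_virus, toy_admissions, max_lag=4)
best_lag = max(correlations, key=correlations.get)
```

Each extra day of lag removes one pair of observations, so with a short series the correlations at longer lags rest on fewer points.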
Here are the results. Not quite 0.994. And the largest correlation is on the day after the samples were taken (whereas the preprint says the best results were found after 3 days). /6
There is also the issue that successive day-to-day readings are probably not independent, but I'll leave that to people who understand the consequences better than I do. /7
However, my provisional conclusion is that we probably shouldn't go round trying to predict hospital admissions from sewage sludge just yet. /8 /end
Tagging in authors @SaadOmer3 @jordan_peccia @NathanGrubaugh @WeinbergerDan, tweeter @BrennanSpiegel, and stats people who commented @stephensenn @RichardTol
You can follow @sTeamTraen.