(1/x) Thread: Rapid review of the REACT-1 round 6 interim report which is making headlines with scary numbers today. To follow this one, you'll need to download the full pre-print from here: https://www.imperial.ac.uk/media/imperial-college/institute-of-global-health-innovation/REACT1_r6_interim_preprint.pdf
(2/x) First things first: the overall methodology of the paper (random sampling of the population) is sound and useful. However, I take issue with the selection of quoted statistics, which (of course) then make the headlines.
(3/x) Here are some numbers that need proper scrutiny. 100,000 cases per day; doubling every 9 days with a national R rate of 1.6; R rate of 2.86 in London (all quoted in https://news.sky.com/story/coronavirus-critical-stage-with-96-000-a-day-getting-covid-19-as-more-stringent-action-needed-scientists-say-12117337):
(4/x) The REACT interim report sampled 85,971 people over 10 days and found 863 positives across 9 regions.
(5/x) Now, 85,971 is a lot and 863 is a sufficient number of positives to be confident that the overall unweighted prevalence of positive samples is around 1%. However, to estimate the number of cases/day, the sample is weighted and then two huge assumptions are made:
(6/x) The study assumes (apparently without evidence) that the PCR tests are only 75% sensitive and that nobody will test positive more than 10 days after initial infection. These assumptions are, IMHO, hugely skewed.
(7/x) We know that PCR tests can pick up non-viable RNA long after 10 days. We know that the tests are very sensitive. An alternative, perhaps more realistic, estimate might assume ~100% sensitivity and a period of 20 days, which would provide a headline number of 36,000 / day.
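For anyone who wants to check the arithmetic, here is a rough sketch of how the headline figure rescales under different assumptions. It assumes (my simplification, not necessarily the report's exact method) that cases/day is proportional to 1 / (sensitivity × days testing positive), and it uses the 96,000/day figure from the Sky headline as the starting point. The function name is just illustrative.

```python
def rescale_daily_cases(headline, sens_old, days_old, sens_new, days_new):
    """Rescale a cases/day estimate to new assumptions.

    Assumes cases/day is proportional to 1 / (sensitivity * days positive),
    which is a simplification and may not match the report's exact method.
    """
    return headline * (sens_old * days_old) / (sens_new * days_new)


# Headline figure with the report's assumptions (75% sensitive, 10 days),
# rescaled to ~100% sensitivity and a 20-day positive window.
estimate = rescale_daily_cases(96_000, 0.75, 10, 1.0, 20)
print(round(estimate))
```

Under those assumptions the result lands on the 36,000/day figure quoted above. Other combinations can be tried by changing the last two arguments.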
(8/x) Now onto the growth rates. ICL uses two methodologies; one a fit to a continuous series and one a fit to individual rounds.
(9/x) The problem with fitting to individual rounds is that the number of positive samples on any individual day varies hugely. Take a look at this graph again, and look at the last day of the most recent round:
(10/x) Yup, that's right, zero positive samples. That one didn't make the headlines. Over 10 days, there is only an average of 86.3 positives/day, so time series estimates based on the individual round have low statistical validity.
(11/x) This is magnified even more on a regional level. Across 9 regions, that's only 9.6 positives/day/region, which leads to very wide confidence intervals.
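To put a rough number on how noisy those daily counts are, here is a small sketch. It treats each daily count as a Poisson variable (my assumption, not something the report states), so the relative standard error is about 1/sqrt(count). It also assumes positives are spread evenly across days and regions, which they are not in practice.

```python
import math

positives, days, regions = 863, 10, 9

# Average counts, assuming an even spread across days and regions.
per_day = positives / days
per_day_region = per_day / regions


def poisson_rel_se(count):
    """Relative standard error of a Poisson count: sqrt(n) / n = 1 / sqrt(n)."""
    return 1 / math.sqrt(count)


print(f"national: {per_day:.1f}/day, rel. SE ~{poisson_rel_se(per_day):.0%}")
print(f"regional: {per_day_region:.1f}/day, rel. SE ~{poisson_rel_se(per_day_region):.0%}")
```

Roughly speaking, a single day's national count is uncertain by about a tenth, and a single day's regional count by about a third, before any weighting is applied.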
(12/x) The national "central estimate" is sound but the daily variations are less sound, and the regional daily variations have very low confidence. This means that the continuous time series should be used when quoting the headline estimates for R and doubling time.
(13/x) It seems that ICL agrees, in principle. The regional trends are presented only for the continuous model:
(14/x) However, for some reason, ICL has chosen to present estimates of R(t) based on both the continuous and discrete samples. Even worse, they have then cherry-picked the higher R estimates from the discrete sample (1.56 nationally and 2.86 in London) for the press round.
(15/x) In conclusion, the REACT study shows that R is probably in the range 1.1 to 1.2 nationally and in most regions, and that there are around 40,000 cases/day; numbers which are in line with all other data sources. It does not show anything else with any degree of confidence.
(16/x) Please retweet this thread and get it out there!
(17/x) The R rate as measured by the national testing programme in England (which has a much larger sample than REACT and includes hospitals and care homes, where R will tend higher) is currently 1.20, with a "nowcast" of around 1.13.
You can follow @stevebrown2856.