Data from 3,000 people at grocery and big-box stores. No info on what fraction refused to provide a sample (which indicates potential size of bias). No info reported on test used or test characteristics. So, no idea about false +/- rates. No info on date when study was done.
Most important: Bias from recruiting at grocery stores definitely exists. How big? We can't know w/out more data. Same for bias w/in people that were asked (those w/ covid-19 symptoms might be more likely to volunteer). True prev can't be >5x the measured value in NYC (b/c prev can't be >100%).
With all these unknowns, what can we conclude? 1st, this is not how crucially important data should be released. It would take study team 15 min to prepare doc w/ details that could address many of these issues. Cuomo is being irresponsible.
Thus we are left to make HUGE assumptions to try to make any sense of the data. Since people will draw conclusions anyway we can at least make our assumptions clear.
Raw results: NYC: 21%, 16.7% Long Island, 11.7% in Westchester and Rockland Counties; Rest of NY: 3.6%
No info on N by location.
If we assume false +s <5% (& upper CI <20%) & data were adjusted for false +, then lower CI on prevalence would exclude zero (which wasn't the case w/ Stanford study). https://statmodeling.stat.columbia.edu/2020/04/19/fatal-flaws-in-stanford-study-of-coronavirus-prevalence/
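For anyone wanting to see what "adjusted for false +" could look like, here is a minimal sketch using the standard Rogan-Gladen correction. The thread reports no test characteristics, so the sensitivity and specificity values below are hypothetical placeholders, and `adjust_prevalence` is an illustrative name, not anything from the study.

```python
def adjust_prevalence(raw_pos_frac, sensitivity, specificity):
    """Rogan-Gladen correction of a raw test-positive fraction.

    true_prev = (raw + specificity - 1) / (sensitivity + specificity - 1)
    The result is clipped to [0, 1] because sampling noise can push it outside.
    """
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("test is no better than chance; cannot adjust")
    adjusted = (raw_pos_frac + specificity - 1.0) / denom
    return min(max(adjusted, 0.0), 1.0)


# HYPOTHETICAL test characteristics (not reported with the NY data).
nyc_adjusted = adjust_prevalence(0.21, sensitivity=0.90, specificity=0.98)
print(f"NYC adjusted prevalence: {nyc_adjusted:.3f}")
```

Note how sensitive low-prevalence regions are to the false + rate: if the raw positive fraction is at or below the false + rate, the adjusted estimate collapses to zero.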
Prev CIs for individual locations could be wide if N for some locations is small. But ASSUMING N is evenly split among 4 regions (N=750 each) and 0 uncertainty in test false +/-, crude CIs are: 18-24% NYC, 14-20% LI, 9.5-14% W/R, 2.4-5.2% Rest NY. Uncertainty in false +/- could GREATLY widen these.
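A rough sketch of how crude CIs like these can be computed, under the same assumption of N=750 per region and no test error. The thread doesn't say which interval method was used; the Wilson score interval here is my choice, so the output lands close to the quoted ranges but won't match them exactly.

```python
import math


def wilson_ci(p_hat, n, z=1.96):
    """Wilson score interval for a binomial proportion (approx. 95% for z=1.96)."""
    denom = 1.0 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


# Raw positive fractions from the thread; N per region is an ASSUMPTION (3,000 / 4).
raw = {
    "NYC": 0.21,
    "Long Island": 0.167,
    "Westchester/Rockland": 0.117,
    "Rest of NY": 0.036,
}
N_PER_REGION = 750

for region, p in raw.items():
    lo, hi = wilson_ci(p, N_PER_REGION)
    print(f"{region}: {100 * lo:.1f}-{100 * hi:.1f}%")
```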
If ALL biases are small AND test is well characterized (small CI on false+ and false-), approx ratio of actual infections to test + (confirmed cases) is: 12 NYC, 20 LI, (4.4 W, 3.9 R - had to split and assumed equal %), 4.0 Rest NY. These values are in range of what most have assumed (~10x, range 5-20x).
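The arithmetic behind a ratio like this is simple: estimated infections (seroprevalence x population) divided by confirmed cases. A minimal sketch follows; the population and case counts are made-up round numbers for illustration only, NOT the NY figures, which aren't given in this thread.

```python
def infections_per_confirmed_case(seroprevalence, population, confirmed_cases):
    """Estimated true infections divided by lab-confirmed (test +) cases."""
    estimated_infections = seroprevalence * population
    return estimated_infections / confirmed_cases


# HYPOTHETICAL inputs: 10% seroprevalence, 1,000,000 people, 10,000 confirmed cases.
ratio = infections_per_confirmed_case(0.10, 1_000_000, 10_000)
print(f"~{ratio:.0f} infections per confirmed case")
```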
Ratio of actual infections to test + is much smaller than numbers cited in Stanford studies in LA (28-55) and Santa Clara (50-85).
Differences likely due to biases in serosurveys and testing capacity. https://twitter.com/DiseaseEcology/status/1251225273871134721
So with a HUGE # of assumptions (b/c no details released w/ data and serosurvey not well designed), I think we can draw these conclusions: 1) IFR is likely w/in range from other studies (0.3-1.2%) and much >0.1% https://twitter.com/DiseaseEcology/status/1252844190070829056
Minimum Point IFRs (IFRvsCFR: https://twitter.com/DiseaseEcology/status/1252844190070829056)
(min b/c WITHOUT accounting for avg time b/w seroconv vs death & calc using prob + conf deaths): 0.87%NYC, 0.26%LI, 1.0%West, 1.5% Rock, 0.6% Rest NY. But CIs on all these would be large so I bet they all overlap broadly.
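For reference, a point IFR of this kind is just deaths divided by estimated infections. A minimal sketch, with placeholder inputs that are NOT the NY numbers (death counts and populations aren't in this thread). It is a minimum because deaths lag seroconversion, so some people already counted as infected will die later.

```python
def point_ifr(deaths, seroprevalence, population):
    """Crude infection fatality ratio: deaths / (seroprevalence * population).

    Ignores the lag between seroconversion and death, so it tends to
    underestimate the eventual IFR.
    """
    estimated_infections = seroprevalence * population
    return deaths / estimated_infections


# HYPOTHETICAL inputs: 500 deaths, 10% seroprevalence, 1,000,000 people.
ifr = point_ifr(500, 0.10, 1_000_000)
print(f"Point IFR: {100 * ifr:.2f}%")
```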
2) Transmission in NYC and surrounding areas was intense (didn't need serosurvey to know this).
3) Under-ascertainment of cases in NY is 5-20x, as most previously assumed, not 30-85x.
4) The fraction of pop that is still susceptible is unfortunately still large (80-97%)
Hopefully study details will be released and we can greatly refine these conclusions.
Please raise Qs with any of these calculations or assumptions. It's too important to get any of this wrong.
@taaltree @joshuasweitz
You can follow @DiseaseEcology.