OK, quick thread on the 3 major types of statistical issues in seroprevalence studies—that is, studies that try to figure out how many people had COVID-19, like the one released in NY yesterday. These are:

1) Test accuracy (false negatives/positives)
2) Sample selection
3) Lags
Test accuracy has gotten the most discussion here. These tests aren't perfect. They produce false positives (it says you had it when you didn't) and false negatives (it says you didn't when you did).
Intuitively, the false positive problem should be easy to see. If a test produces, say, 1-2% false positives among healthy people, then results saying that 2-3% of people tested positive in a given region just won't really tell you very much.
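To make that intuition concrete, here's a toy sketch (all numbers are illustrative, not from any real study) of the forward model: the raw positive rate is a mix of true positives and false positives.

```python
# What fraction of tests come back positive, given the true prevalence,
# the test's sensitivity, and its false positive rate?
# All numbers below are made up for illustration.
def apparent_positive_rate(prevalence, sensitivity, fpr):
    # True positives among the infected, plus false positives among the rest.
    return prevalence * sensitivity + (1 - prevalence) * fpr

# A region with ZERO true infections and a 2% false positive rate
# still reports 2% of tests positive:
print(f"{apparent_positive_rate(0.00, 0.90, 0.02):.1%}")  # 2.0%

# A region with 1% true prevalence reports 2.9% positive -- barely
# distinguishable from the zero-infection case:
print(f"{apparent_positive_rate(0.01, 0.90, 0.02):.1%}")  # 2.9%
```

So a 2-3% raw positive rate is consistent with anything from essentially no infections up to a few percent, which is the thread's point.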
Technically, the issue is not the false positive rate per se but that *we don't know what the true false positive rate* is. If we knew false positives were *exactly* 1%, we could correct for it. But it could be 0.5% or 2.5% or who knows—maybe not what the manufacturer claims.
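If sensitivity and the false positive rate really were known exactly, the correction would be the standard Rogan-Gladen adjustment. A minimal sketch, with assumed numbers:

```python
# Rogan-Gladen correction: recover true prevalence from the raw positive
# rate, ASSUMING sensitivity and specificity are known exactly
# (the thread's point is that in practice they are not).
def corrected_prevalence(raw_positive_rate, sensitivity, specificity):
    return (raw_positive_rate + specificity - 1) / (sensitivity + specificity - 1)

# Example: 3% of tests come back positive; the test catches 90% of true
# cases (sensitivity) and has a 1% false positive rate (specificity 0.99).
print(round(corrected_prevalence(0.03, 0.90, 0.99), 4))  # 0.0225
```

With a known 1% false positive rate, a raw 3% becomes a corrected ~2.25%. The correction is trivial arithmetic; the hard part is knowing the inputs.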
Unless these tests are *very* accurate—and they probably won't be, since it's still the early stages of this crisis—this renders antibody studies not terribly useful in places where the underlying incidence of a disease in a population is low (say, under 5%).
However, this should be less of a problem in places that have had a medium-to-bad epidemic; say, Belgium or London—or certainly NYC. Studies in those places should therefore be more reliable.
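A quick sketch of why (again with assumed numbers): let the false positive rate range over the 0.5%-2.5% uncertainty mentioned above and compare the resulting corrected estimates in a low-prevalence region versus a hard-hit one.

```python
# Rogan-Gladen correction again, with sensitivity held at an assumed 90%.
def corrected_prevalence(raw, sensitivity, specificity):
    return (raw + specificity - 1) / (sensitivity + specificity - 1)

def estimate_range(raw):
    # Corrected estimates as the false positive rate varies 0.5% -> 2.5%.
    lo = corrected_prevalence(raw, 0.90, 1 - 0.025)  # worst-case FPR
    hi = corrected_prevalence(raw, 0.90, 1 - 0.005)  # best-case FPR
    return lo, hi

low_lo, low_hi = estimate_range(0.03)    # low-prevalence region: 3% raw
high_lo, high_hi = estimate_range(0.25)  # hard-hit region: 25% raw
print(f"3% raw positives:  {low_lo:.1%} to {low_hi:.1%}")    # ~0.6% to ~2.8%
print(f"25% raw positives: {high_lo:.1%} to {high_hi:.1%}")  # ~25.7% to ~27.4%
```

The same uncertainty in the false positive rate makes the low-prevalence estimate swing by roughly a factor of five, while the hard-hit estimate moves by only a couple of points.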

Further, there is *also* the issue of false negatives.
You can follow @NateSilver538.