Over the past few weeks, we’ve noticed that newsrooms of all sizes—and even some government agencies—have fallen into some of the data potholes that we’ve become familiar with in our year of wrangling public COVID-19 data.
Tip 1: If you see dramatic movement in the data, look for contextual clues before interpreting it as a change in the pandemic. Day-of-week effects in data arranged by date of report produce predictable reporting swings over the course of each week.
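One common way to look past day-of-week swings is a seven-day trailing average, since every window then contains exactly one of each weekday. The sketch below is only an illustration: the function name and the daily counts are made up, not drawn from any real dataset.

```python
from statistics import mean

def trailing_average(values, window=7):
    """Return the trailing mean for each position once a full window is available."""
    return [
        mean(values[i - window + 1 : i + 1])
        for i in range(window - 1, len(values))
    ]

# Hypothetical daily counts by date of report: two identical weeks
# with a weekend dip (the last two values of each week).
daily_reported = [100, 120, 130, 125, 110, 60, 40] * 2

smoothed = trailing_average(daily_reported)
# The raw series swings between 40 and 130, but every seven-day window
# covers one full week, so the smoothed series stays flat.
```

A smoothed series still absorbs backlogs and data dumps, so it does not replace checking for the reporting issues described in the next tips.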
Tip 2: Data backlogs—and the “data dumps” that occur when those backlogs are resolved—can mimic major declines and then jumps, especially in cases, tests, and deaths. Look for explanations on state dashboards and call public health officials.
Tip 3: Holidays and severe weather can disrupt reporting across many states at once, and the resulting dips and rebounds can mimic shifts in the pandemic. Look for holidays or major disruptions that might have artificially depressed, and then inflated, the data.
Tip 4: Watch out for definitional mismatches and alternate dating schemes. Be aware that different jurisdictions chose different ways of defining and reporting their metrics.
Tip 5: Get familiar with caveats. The most recent dates in epidemiological datasets are always incomplete: the data points for people who died today, for example, won’t finish being reported for days, weeks, or even months.
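One way to act on this caveat is to set the most recent dates aside as provisional before describing a trend. The sketch below assumes a 14-day lag purely for illustration. The right window depends on the metric and the jurisdiction, and the function name and values are hypothetical.

```python
from datetime import date, timedelta

def split_provisional(series, as_of, lag_days=14):
    """Split (date, value) rows into settled rows and rows likely to be revised.

    lag_days is an assumed reporting lag, not an official figure.
    """
    cutoff = as_of - timedelta(days=lag_days)
    settled = [(d, v) for d, v in series if d <= cutoff]
    provisional = [(d, v) for d, v in series if d > cutoff]
    return settled, provisional

# Hypothetical series: 30 days of counts ending on the "as of" date.
as_of = date(2021, 3, 1)
series = [(as_of - timedelta(days=n), 50) for n in range(29, -1, -1)]

settled, provisional = split_provisional(series, as_of)
# Only the settled rows should be used for trend statements; the
# provisional rows can be charted separately with a clear label.
```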
Tip 6: Be cautious about what the data can say. If you’re trying to extract insights from the data itself, it can be very easy—especially within a headline—to make causal claims when only correlative evidence is available.
You can follow @COVID19Tracking.