I just saw a study that said there’s a 59% chance a single person in a group of ten in Chicago has Covid, the highest in the country. I freaked out until I read what the graphic actually meant.
                        
                        
                        
                        
                                                
                    
                    
                                    
                    
                        
                        
My inclination was to interpret it the way everybody does when presented with a statistical claim (as a population claim), even though I know better. And interpreted as such, my first impulse would be to get a Covid test &, if negative, leave Chicago and drive elsewhere.
                        
                        
                        
                        
                                                
                    
                    
                                    
                    
                        
                        
I went to the data source and, first of all, the graphic I saw ~*sort*~ of misrepresented the estimator’s data source: they used the CDC’s updated seroprevalence field studies, BUT their county-level data came from a NY Times aggregator.
                        
                        
                        
                        
                                                
                    
                    
                                    
                    
                        
                        
Now, using the above, they:
1. Took the higher estimate of the seroprevalence & case reports
2. Multiplied them by the higher skew factor of 10 (median is 5), basically to account for under-testing
3. Then estimated the chance that any individual has Covid
                    
                                    
                        
                        
This procedure is legitimate, and to the credit of the graphic’s data source, they are up front about all of the above and let the user change the specifications.
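A minimal sketch of those three steps, with made-up inputs. The function name, the case count, and the population are hypothetical; only the skew factor of 10 (and the median of 5) comes from the description above, and the tool's real calculation may differ in detail.

```python
def estimated_prevalence(reported_cases, population, skew_factor=10):
    """Scale reported cases up for under-testing; return a per-person probability."""
    if population <= 0:
        raise ValueError("population must be positive")
    estimated_infections = reported_cases * skew_factor
    # A probability cannot exceed 1, so cap it.
    return min(estimated_infections / population, 1.0)


# Hypothetical county: 20,000 reported active cases, 2,500,000 residents.
high = estimated_prevalence(20_000, 2_500_000, skew_factor=10)
median = estimated_prevalence(20_000, 2_500_000, skew_factor=5)
print(f"skew factor 10: {high:.1%}, skew factor 5: {median:.1%}")
```

Halving the skew factor halves the estimate, which is why the choice between 10 and 5 matters so much.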
                        
                        
                        
                        
                                                
                    
                    
                                    
                    
                        
                        
And while the producer of the graphic didn’t lie (they did say chances an individual has Covid in a group of X size or more), the per-person rate implied by 60% for a group of ten is in the single digits, roughly 8 to 9% if you treat the ten people as independent. But their presentation was purposefully obfuscatory & meant to scare.
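To make that back-calculation concrete, here is a rough sketch. It assumes each of the ten people is infected independently with the same probability p, so the group figure is 1 - (1 - p)^n. That independence model is an assumption for illustration, not something the graphic states. Under it, the per-person rate behind a 59% group figure comes out in the single digits.

```python
def per_person_rate(group_risk, group_size):
    """Invert group_risk = 1 - (1 - p)**n for p, assuming independent individuals."""
    if not 0 <= group_risk < 1:
        raise ValueError("group_risk must be in [0, 1)")
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    return 1 - (1 - group_risk) ** (1 / group_size)


p = per_person_rate(0.59, 10)
print(f"implied per-person rate: {p:.1%}")
```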
                        
                        
                        
                        
                                                
                        
                                                
                    
                    
                                    
                    
                        
                        
So, based on taking the higher field study seroprevalence estimates & the higher range from the NY Times county & news aggregator data set, and multiplying by the higher skew factor of ten, this suggests a single-digit percentage of people in Chicago currently have Covid at contagious levels.
                        
                        
                        
                        
                                                
                    
                    
                                    
                    
                        
                        
Note the NY Times data is based on aggregating county-level reported cases, while the CDC data is based on their various testing regimes. The two, as anyone can tell you, are not exactly the same thing lol.
                        
                        
                        
                        
                                                
                    
                    
                                    
                    
                        
                        
                        For example, if a county has limited testing resources, but among the people who get Covid, more go to the hospital/seek care, and more get correctly diagnosed as having it, their estimates will be driven *upward* by this model three separate times.
                        
                        
                        
                        
                                                
                    
                    
                                    
                    
                        
                        
I can’t speak to all of it, but for Chicago, at least, limited testing availability alongside high admissions & treatment characterized the city at the beginning of the pandemic. I’d have to look more closely now, and I’ll get on that.
                        
                        
                        
                        
                                                
                    
                    
                                    
                    
                        
                        
Here’s the graphic I saw: https://www.instagram.com/p/CH0cTHEAObW/?igshid=qkctjc7ml50f
                        
                        
                        
                        
                                                    
                            
                                                
                    
                    
                                    
                    
                        
                        
Here is the risk assessment tool. Immediately notice the difference in presentation, sourcing, framing, estimating, and explaining. This tool is pretty cool, but it cannot be used to make reified facts to share as memes on Insta like the above! https://covid19risk.biosci.gatech.edu
                        
                                                
                        
                        
                        
                                                
                    
                    
                                    
                    
                        
                        
And here are its data sources: https://covidtracking.com/api/
                        
                                                
                        
                        
                        
                                                
                    
                    
                                    
                    
                        
                        
The use of this latter one is more sketchy than the former, but it’s probably one of the few available: https://github.com/nytimes/covid-19-data
                        
                                                
                        
                        
                        
                                                
                    
                    
                                    
                    
                        
                        
Anyway, the Georgia Tech tool is meant to be heuristic: the point of the tool is that small reported & estimated rates in other media & data sets may translate into disproportionate risks in groups.
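A quick sketch of that point, again assuming people in a group are infected independently at the same rate (an illustrative assumption). The 1% per-person rate and the group sizes are hypothetical, chosen only to show how fast a small rate compounds.

```python
def group_risk(per_person_rate, group_size):
    """Chance that at least one of group_size independent people is infected."""
    return 1 - (1 - per_person_rate) ** group_size


rate = 0.01  # hypothetical 1% per-person rate
risks = {size: group_risk(rate, size) for size in (10, 25, 50, 100)}
for size, risk in risks.items():
    print(f"group of {size:>3}: {risk:.0%}")
```

A rate that sounds negligible for one person becomes a better-than-even chance once the group is large enough.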
                        
                        
                        
                        
                                                
                    
                    
                                    
                    
                        
                        
It’s also educational: it’s basically a hands-on way to show people how epidemiology, data aggregation, & statistics work, & how the assumptions behind them alter our inferences & conclusions.
                        
                        
                        
                        
                                                
                    
                    
                                    
                    
                        
                        
The Insta graphic took the above, stripped it of these aspects, made it hard to track down the source, & then presented it in a misleading way, meant to scare people & to travel far on the Internet.
                        
                        
                        
                        
                                                
                    
                    
                                    
                    
                        
                        
By the way, this is basically my critique of half of y’all’s use of data metrics that have become meme-ified & reified, such as the ‘100 companies’ meme, which is an even worse chain of inferences & elisions than this Insta meme.
                        
                        
                        
                        
                                                
                    
                    
                                    
                    
                        
                        
I need to look deeper to see how NYT avoids double counting & certain reporting biases. As always, confirming against multiple sets of data (WHO, CDC, local health groups, private foundations, news cases, aggregators, etc.) is better.
                        
                        
                        
                        
                                                
                    
                    
                                    
                    
                        
                        
I also don’t know how to analyze people who may have already had Covid. WHO said that, as of roughly 52 days ago, 10% of the world had probably contracted it (about .5% confirmed).
https://www.who.int/docs/default-source/coronaviruse/situation-reports/20201005-weekly-epi-update-8.pdf
https://www.who.int/bulletin/online_first/BLT.20.265892.pdf
https://apnews.com/article/virus-outbreak-archive-united-nations-54a3a5869c9ae4ee623497691e796083
                            
                                
                                
                                
                            
                            
                        
                        
                        
                        
                                                
                    
                    
                                    
                        
                        
At that point in time, 34 million or so had been confirmed. Now 60 million have. If 34 million, .435% of world pop, suggested 10%, then 60 million suggests just under 18%. Of course testing frequency & accuracy change, but this is part of the problem.
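That extrapolation is plain proportional scaling, sketched below. It assumes the ratio of true infections to confirmed cases stays constant over time, which is exactly the shaky part. The function name is hypothetical; the 34 million, 60 million, and 10% figures are the ones quoted above.

```python
def scaled_share(confirmed_then, share_then, confirmed_now):
    """Scale an estimated infected share in proportion to confirmed case counts."""
    if confirmed_then <= 0:
        raise ValueError("confirmed_then must be positive")
    return share_then * confirmed_now / confirmed_then


share_now = scaled_share(confirmed_then=34e6, share_then=0.10, confirmed_now=60e6)
print(f"implied share infected now: {share_now:.1%}")
```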
                        
                        
                        
                        
                                                
                    
                    
                                    
                    
                        
                        
Some of these absurdly high fatality rates & absurdly low ones are highly sus, but here’s the data: https://coronavirus.jhu.edu/data/mortality
                        
                        
                        
                        
                                                    
                            
                                                
                    
                    
                                    
                    
                        
                        
                        For example, above, I find Yemen & Mexico’s absurdly high rates hard to believe for various reasons while for these I find about half very suspect.
                        
                        
                        
                        
                                                    
                            
                                                
                    
                    
                                    
                    
                    
                                    
                    
                    
                                    
                    
                    
                
                 
                             
                                         
                                         
                                         
                                         
                                         
                                         
                                         
                                         
                                         
                                         
                                         
                                         
                                         
                                         
                                         
                                         
                                         
                                         
                                         
                                         
                                     
                                    