"2+2=5" AND DATA ANALYSIS AKA Why do I, a statistician, care about whether "2+2" can ever equal "5"? 1/🧵
It was unfortunate that the statement "2+2=5" became the focus of the debate because it's tied to the book "1984". "1984" is a deep meditation on linguistic totalitarianism but also ENTIRELY fictional. Torture by forced repetition of false arithmetic facts is NOT A THING! 2/
I underappreciated that people read the book when they're young. It was clearly an emotional experience for many. Also, "2+2=5" is vastly more awkward to justify than "12+12=0" which is just as jarring for the uninitiated but easily justified by a physical analogy to clocks. 3/
When the average commenter talks about “addition”, they clearly think there’s ONLY ONE definition that applies to every situation. Yet, they must be aware that these are true statements about angles:

180 + 180 = -360

359 + 1 = 0

1 + 0 = 721

It’s a mystery. 4/
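For anyone who wants to check these, here is a minimal Python sketch. It assumes the angles are measured in degrees and that two numbers count as "equal" when they point in the same direction, i.e. when they differ by a whole number of full turns. The helper name `same_angle` and its `full_turn` parameter are my own, made up for illustration.

```python
def same_angle(a, b, full_turn=360):
    """Return True when a and b differ by a whole number of full turns."""
    return (a - b) % full_turn == 0


# The three statements about angles, read as "same direction in degrees".
assert same_angle(180 + 180, -360)
assert same_angle(359 + 1, 0)
assert same_angle(1 + 0, 721)

# The clock analogy: 12 + 12 = 0 on a 12-hour dial.
assert same_angle(12 + 12, 0, full_turn=12)

# Under this notion of equality, 2 + 2 is still not 5.
assert not same_angle(2 + 2, 5)
```

The arithmetic on the left of each statement is ordinary addition. What changes is the rule for deciding when two results are the same, and that rule comes from the situation, not from the symbols.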
This situation is hard to justify as just being about units because then you have to say "1 + 0 = 721 DEPENDING ON THE UNITS", which then means arithmetic statements are conditional on the situation. This is the point of my original (controversial) tweet! https://twitter.com/kareem_carr/status/1288838380625821696?s=20 5/
If you’ve been following me for a while then you know what I really care about is data analysis not arithmetic. So let me loop this back to data. 6/
I know people aren’t used to thinking like this but as a practical person, this is how I think. When I say “2🍎 + 2🍎 = 4🍎” vs “2🐕 + 2🐕 = 4🐕”, are these supposed to be the same mathematical space? Are they copies of the same space? 7/
Is the 🐕 space the same as the space represented by regular arithmetic? How strong an assumption is that to make about a class of objects in the physical world? How well do I really understand the class 🐕? 8/
Let’s take something less concrete like a “covid case”. Let’s denote a “covid case” as 🤒. There’s a lot of ambiguity in what counts as a case. Are you sure numerical operations on the space of 🤒 behave exactly like the regular numbers we know and love? 9/
In my opinion, this is part of why analyzing case numbers is so hard. People with some physics training might say it’s a units issue, which makes sense for physics, but in the rest of life things are much fuzzier. 10/
There are probably hundreds of reasonable, near-compatible definitions of a covid case that interact in complex and non-linear ways not completely captured by the concept of physical “units”. 11/
Every country. Every region. Every doctor performing a diagnosis. Every diagnostic test. Every researcher conducting a study represents a potential for a new and different definition of a case. 12/
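Here is a toy Python sketch of one way this can bite. Everything in it is hypothetical: the patient IDs are invented, and the two case definitions (lab-confirmed vs. clinically diagnosed) are stand-ins I chose for illustration, not real reporting rules.

```python
# Hypothetical patient IDs, invented purely for illustration.
lab_confirmed = {"p1", "p2"}         # 2 cases under a lab-test definition
clinically_diagnosed = {"p2", "p3"}  # 2 cases under a symptom-based definition


def naive_total(*case_sets):
    """Add up the reported counts as if they were plain numbers."""
    return sum(len(cases) for cases in case_sets)


def distinct_total(*case_sets):
    """Count distinct people across all the definitions."""
    return len(set().union(*case_sets))


print(naive_total(lab_confirmed, clinically_diagnosed))     # 4
print(distinct_total(lab_confirmed, clinically_diagnosed))  # 3
```

Both sources honestly report 2 🤒, yet "2 🤒 + 2 🤒" comes out as 4 or 3 depending on whether the definitions overlap. The counts alone can't tell you which answer applies.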
Imagine I give you a category of thing about which you know nothing. ◼️, a black box. Can you say with certainty that “2 ◼️ + 2 ◼️ = 4 ◼️”? I would argue you can’t. You need to know the properties of ◼️ to say anything. 13/
So that’s part of my case for thinking more deeply about the link between numbers and the real world. These quantification issues are a huge part of what I think about because we humans are rapidly turning everything we can perceive into data and numbers. 14/
I think the day is coming when our lives will depend on whether we’re getting this process right! 🧵/
You can follow @kareem_carr.