OK, friends, this is a big one. How Eugenics Shaped Statistics, my cover story for this month’s issue of Nautilus, is now online.
(Disclaimer: I also love science and hate Nazis.)
The story concerns the too-close relationship between statistics (think AP Stats class; news stories about drugs or nutrition that say “statistically significant”) and the eugenics movement (think forced sterilization programs; Nazis).
http://nautil.us/issue/92/frontiers/how-eugenics-shaped-statistics
via @NautilusMag
Movements to cancel* the famous eugenicist scientists whom a number of institutions are named after have seen a surge of popularity this year, for the same reasons as other similar initiatives to topple monuments and rename things.

*hold accountable for their words and actions
Another big developing story is the crisis of statistical methods in science, accelerated by large-scale failures of scientific results to replicate, as I explained in a previous piece. This has many practitioners debating the foundations of statistics. http://nautil.us/issue/74/networks/the-flawed-reasoning-behind-the-replication-crisis
So, this new story is about how those two stories are really one story. What’s wrong with modern statistics is that it was born out of the eugenics movement, and so it has hopelessly impure bloodlines. (See what I did there?)
You know who else was obsessed with bloodlines? Francis Galton, who coined the term eugenics and thought we could selectively breed a society of geniuses, meaning, for Galton, a society of Galton-clones. You may not know Galton’s name but you’ve definitely encountered his ideas.
Galton created the statistical concepts of “correlation” and “regression to the mean.” He was also the kind of person who recorded the attractiveness of every woman he passed on the street and measured women’s proportions at a distance using a sextant. And he *hated* Africans.
Karl Pearson was Galton’s protégé and the founder of mathematical statistics. He had some white-nationalist/eugenicist views that would make Richard Spencer blush. Pearson seemingly never met a social problem he thought couldn’t be solved with genocide.
Ronald Fisher, a dyed-in-the-wool eugenicist, was easily the most influential scientist of the 20th century and is in the conversation for most influential all-time. Statistics, as used in millions of papers in every discipline, is what it is today largely because of Fisher.
People outside of statistics may find it surprising that its most influential figures were also ardent eugenicists. It may only seem worthy of a footnote, though, until you consider how that commitment shaped their (hence our) approach to interpreting experimental data.
Inside the discipline, the fact that Galton, Pearson, and Fisher — the titans of statistics — were also eugenicists seems to have a weird status as something everyone already knows and yet is constantly surprised to learn.
If nothing else, I hope to encourage people who use the tools of G/P/F and have a passing familiarity with their eugenicist views to go read them in their own words. Take, say, 2 years and get inside their heads, as I have, then tell me how you feel about ANOVA.
The real problem, I think, is that G/P/F were also, undeniably, geniuses at answering the theoretical questions they set up for themselves and also extremely effective at communicating their more mathy ideas in ways that seemed authoritative and convincing.
Fisher’s math arguments, in particular, were brilliant in ways that give me chills: first, as someone who appreciates their beauty, and second, as a human who knows the horrible real-world programs they enabled. It’s like if Josef Mengele were also a virtuoso surgeon.
(For those worried that my brush is too broad: Fisher was not shy about expressing his sympathetic leanings towards the eugenics programs conducted by the Nazis. Yes, the actual Nazi Party. Yes, after the Holocaust.)
Deriving things like the formula for the probability distribution of the sample correlation coefficient for a bivariate normal population isn’t inherently a fascist enterprise. The problem is how people use those formulas with real live data to interpret the world around them.
Deriving mathematical formulas is not the essence of statistics. Deriving meaning from messy real-world observations is.
The cleverness of these mathematical calculations has a way of obscuring whether these were ever the right questions to be asking in the first place. It turns out “statistical significance” is actually a mostly meaningless concept with no tangible real-world utility.
That’s not just my opinion. Last year, the American Statistical Association advised us to stop saying “statistically significant” because it was too often misused to suggest importance where there is none, and conversely, deny importance where it exists. https://www.tandfonline.com/doi/full/10.1080/00031305.2019.1583913
(BTW, how’s that working out? Just 39,000 citations in 2020 alone [plus many more if you count the “p<0.05”s and asterisks in tables of regression coefficients]. Yikes.)
Statistical trickery 101: Collect enough data and you’ll find some statistically significant differences, just very tiny ones. Conversely, take not-quite-enough data and you can dismiss any real differences you do observe by saying they don’t meet the threshold of significance.
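Here’s a rough sketch of that trick in Python, if you want to see the arithmetic. It assumes the simplest textbook setup, which is my choice for illustration and not taken from any particular study: a two-sided z-test comparing two equal-size groups with a known common standard deviation. The function name, the effect sizes, and the sample sizes are all made up.

```python
import math

ALPHA = 0.05  # the conventional (and arbitrary) significance threshold


def two_sample_z_p_value(diff, sd, n_per_group):
    """Two-sided p-value for an observed difference in means between two
    equal-size groups, assuming a known common standard deviation."""
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    z = diff / standard_error
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical scenarios: (label, observed difference in sd units, n per group)
scenarios = [
    ("tiny difference, huge sample", 0.01, 1_000_000),
    ("tiny difference, small sample", 0.01, 100),
    ("big difference, small sample", 0.50, 10),
    ("big difference, modest sample", 0.50, 100),
]

for label, diff, n in scenarios:
    p = two_sample_z_p_value(diff, sd=1.0, n_per_group=n)
    verdict = "significant" if p < ALPHA else "not significant"
    print(f"{label}: p = {p:.4g} ({verdict})")
```

Under these assumptions, a difference of one-hundredth of a standard deviation clears the bar with a million observations per group, while a difference of half a standard deviation fails it with ten per group. The verdict tracks the sample size at least as much as it tracks the size of the effect.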
It’s this kind of nonsensical dichotomizing of results into significant/insignificant that’s led us to where we are now, dealing with the fallout of statistical results that are devoid of scientific impact, what Ziliak and McCloskey call “oomph.” https://books.google.com/books?id=JWLIRr_ROgAC&q=oomph#v=snippet&q=oomph&f=false
But, as I argue in the piece, how these otherwise smart people got tangled up in bad statistical logic makes total sense if you remember the eugenics context. Statistical significance is *incredibly* useful if what you’re trying to do is taxonomically separate people by race.
Arguing the absence of an association because your measurements failed to meet an arbitrary significance threshold is also very useful if you’re trying to deny the existence of, say, a correlation between IQ scores and environmental factors (contrary to eugenicist purposes).
Any explanation of how we got in our current predicament must therefore include the history of how statistical thinking evolved over the last century, and a huge part of that story must involve eugenics.
Likewise, anyone looking to blithely dismiss these odious views of the founders of statistics as irrelevant now should ask themselves whether it’s truly possible to separate their approach to answering scientific questions from the *one main* application they designed it for.
Eugenics was not an incidental hobby in the careers of Galton, Pearson, or Fisher, nor an embarrassing phase they grew out of. In many ways it animated their entire intellectual projects.
The best argument, in fact, that we shouldn’t try to separate their statistical work from their eugenics advocacy is that they made absolutely no effort to do so. Pearson, for example, used his “statistical” journal Biometrika to publish unfiltered eugenics propaganda.
Statistics is a way of seeing the world, and you can’t become a leading light of the eugenics movement without seeing the world in some pretty twisted ways.
I’m grateful to @NautilusMag for giving me a platform and enough breathing room to do justice to this complex story, and to the many people who helped. Shout-out to Nate Joselson and his excellent website “Meditations on Inclusive Statistics”: https://njoselson.github.io/Motivation/ 
If you want to read more about the math, history, and philosophy of statistics, and how things started going off the rails around 1700, stay tuned for my book BERNOULLI'S FALLACY, coming out next spring from Columbia University Press:
https://aubreyclayton.com/bernoulli 
@ColumbiaUP
I especially want to thank Professors @daniela_witten and @EKTBenn for speaking with me, and for their advocacy for changing the name of the Fisher lecture. To be clear: we likely disagree about the broad issues of statistical practice, viz. significance testing.
It’s my opinion that to purge statistics of its eugenicist origins we should burn both ideas completely to ashes, but I don’t mean to implicate anyone else in that cause. That being said, there are plenty of torches here if anyone wants one, and I’ll gladly give you a light. /end
You can follow @aubreyclayton.