I'm chuffed! And nervous! The paper was completely unplanned and yet somehow inevitable -- developing and writing this was easily one of the most educational and fun experiences in my PhD.
Bear with me for the backstory of how that happened, or just read the paper 😉
Thread:
When I started my PhD, the post-replication-crisis meta-science field seemed to take two things for granted:
a) psych research is mostly hypothesis testing and
b) psychologists mostly test their hypotheses with significance tests (NHST).
Or at least I had internalised that view.
I based my own research on these assumptions and failed: @EmmaHendersonRR and I piloted a study that was meant to become an inventory of Registered Reports, coding all hypotheses and statistics in RRs. But linking hypotheses to results and conclusions was so difficult that we gave up.
Of course we knew that research hypotheses aren't identical to statistical hypotheses! But I dare you to try it yourself with a few papers. People run lots of tests; identifying how each one informs the substantive hypothesis is VERY difficult (plus hypothesis descriptions are often inconsistent).
At the same time, thanks to our work on equivalence testing, @lakens, @peder_isager & I were involved in dozens of conversations about how to determine a smallest effect size of interest (SESOI).
Bottom line: Psychologists have almost no information about what effects to expect.
All that was super confusing and also fascinating. The reform movement had been all about "fixing" hypothesis tests: Cut the wiggle room to prevent false positives. Yet psychologists couldn't even specify their hypotheses well enough to fit into that new scheme!
It reminded me of reform-critical sentiments: "How am I supposed to know my hypotheses before I've seen the data?" "Preregistration and RRs stifle creativity & will impede scientific progress!"
This all looked like a deep, not just technical, struggle to specify hypotheses.
Last year I visited @fidlerfm's lovely lab in Melbourne, presented my semi-coherent thoughts & got brilliant input from @hm_watkins, Kristian Camilleri & @tariqeden: e.g., I learned that historians of science have emphasised the exploratory underbelly of scientific practice for a while.
@hm_watkins reinforced my view that we have been conflating the exploratory-confirmatory dimension with rigour in an unhelpful way.
Hypotheses are somehow supposed to come from exploration, but we never talk about how to explore well. The framing is "all bets are off, good luck".
Now let's get to the actual paper!
Why should hypothesis testers spend less time testing hypotheses? Because we think that *informative* hypothesis tests require a lot of background knowledge. Ignoring that leads to premature, arbitrary tests with arbitrary inferences.
If you can't specify a SESOI or prior for your hypothesis, you may not be ready to test a hypothesis.
If you don't know how to calculate the power of your test, you may not be ready to test a hypothesis (see the sketch after this list).
If you struggle with the prereg, you may not be ready to test a hypothesis.
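To make the first two points concrete: here's a minimal sketch (Python + statsmodels) of what specifying a SESOI and calculating power can look like. The numbers are made-up assumptions for illustration, not from the paper.

```python
# Minimal sketch, assuming a SESOI of d = 0.3 for a two-sample design.
# All numbers are illustrative; justifying them is exactly the hard part.
from statsmodels.stats.power import TTestIndPower

sesoi = 0.3  # smallest effect size of interest (Cohen's d), assumed

# Sample size needed to detect d = 0.3 with 90% power,
# two-sided two-sample t-test at alpha = .05:
n_per_group = TTestIndPower().solve_power(
    effect_size=sesoi, alpha=0.05, power=0.90, alternative="two-sided"
)
print(f"n per group: {n_per_group:.0f}")  # roughly 235
```

If you can't justify a number like that 0.3, that's the missing background knowledge we're talking about.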
So what's that "background knowledge" you need for informative hypothesis tests?
We went with these -- not definitive, but hopefully useful -- categories:
concepts, measures, (causal) relationships between concepts, boundary conditions, auxiliary assumptions, and statistical predictions.
The core message of our paper: Instead of jumping to premature tests, let's build that background knowledge.
How? We think it requires a bunch of "non-confirmatory" research methods that we've been neglecting: e.g., purely descriptive research or exploratory experimentation.
The point isn't that psychologists aren't doing any of this! It's that mainstream psych barely has a language for the goals & mechanisms of research activities that are not hypothesis tests.
But if our knowledge base depends on these activities, we need to teach and reward them.
We use the research programme on kama muta to showcase the role and importance of non-confirmatory research.
This bit was so much fun -- we spoke to Alan Fiske, Beate Seibt, and Thomas Schubert and dug into their extremely wholesome work. It's a super uplifting literature!
Summing up:
In trying to fix hypothesis tests, we've come to realise that the problem goes far deeper than the tests themselves. Now let's shift our focus to strengthening the (empirical and theoretical) foundation of our discipline.
We hope our paper can start a conversation about that.
Final thoughts:
1) Of course our points aren't completely new! We found soooo many fitting quotes from old papers... we could have written an entire remix paper. But I do think this perspective has been missing from the reform movement's narrative.
2) We focused on the space between theory and hypothesis tests and didn't say much about theory building -- but of course these things are all intertwined in practice. I hope that our paper can be seen as complementary to the many great new perspectives on theory coming out.
3) Writing this was a fantastic teamwork experience @LeonidTiokhin @peder_isager @lakens ♥
It was hard and exhilarating. Each of us made the paper way better. Special shout-out to Leo for working extremely hard on this while also mentoring me through the first-author experience!