A common view is that a hypothesis must precede data analysis for science to progress.

I disagree, and history has good counterexamples, for instance Kepler's laws of planetary motion 🪐 and Mendeleev's periodic table of the elements 🧪.

A small thread
https://twitter.com/NicoleBarbaro/status/1316792695713480704

1/6
Kepler noted that "the ratio between the period times of any 2 planets is precisely the ratio of the 3/2th power of the mean distance"
https://en.wikipedia.org/wiki/Kepler's_laws_of_planetary_motion#Third_law
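To make the quoted relation concrete, here is a minimal sketch that compares observed period ratios with the 3/2 power of the distance ratios. The values are rounded modern textbook figures (semi-major axis in AU, period in years), not Kepler's own data, and the names `PLANETS` and `period_ratio_check` are made up for this illustration.

```python
# Rounded modern values: (semi-major axis in AU, orbital period in years).
# Assumption: these are illustrative textbook figures, not Kepler's tables.
PLANETS = {
    "Mercury": (0.387, 0.241),
    "Venus": (0.723, 0.615),
    "Earth": (1.000, 1.000),
    "Mars": (1.524, 1.881),
    "Jupiter": (5.203, 11.862),
    "Saturn": (9.537, 29.457),
}


def period_ratio_check(planet_a, planet_b):
    """Return (observed period ratio, ratio predicted by the 3/2 power law)."""
    dist_a, period_a = PLANETS[planet_a]
    dist_b, period_b = PLANETS[planet_b]
    observed = period_a / period_b
    predicted = (dist_a / dist_b) ** 1.5
    return observed, predicted


if __name__ == "__main__":
    # Compare every planet against Earth as the reference.
    for name in PLANETS:
        observed, predicted = period_ratio_check(name, "Earth")
        print(f"{name:8s} observed={observed:7.3f} predicted={predicted:7.3f}")
```

With these rounded values, the two columns should agree to well within one percent for every planet.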

His observations improved the model of Copernicus and were later explained by Newton's laws
http://ircamera.as.arizona.edu/Astr2016/lectures/kepler.htm

2/6
Mendeleev noted that "properties of the elements, and thus properties of light and heavy bodies formed by them, are in a periodic dependence on their atomic weight."
https://en.wikipedia.org/wiki/History_of_the_periodic_table

Earlier observations on the ordering of the elements by Newlands were ridiculed by chemists.

3/6
The periodic table is a manifestation of atomic-orbital properties, explained by quantum physics, and of the inner structure of nuclei, which is explained by the standard model of particle physics, a pillar of the modern laws of the universe.

4/6
History shows that finding unexplained patterns in data is core to scientific progress, and this does not always fit within hypothesis-driven research.

Hypothesis-driven research is anchored in a scientific paradigm (a world view) and cannot bring about paradigmatic revolutions.

5/6
Without a hypothesis, the danger is data dredging, or "fishing expeditions" that undermine statistical control
https://en.wikipedia.org/wiki/Data_dredging

The solution in data-driven research is to derive clear predictions that can be confirmed or refuted on new data.
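One way to picture this, as a rough sketch only: find a pattern on part of the data, freeze it as a prediction, then check it on data that played no role in finding it. The example below reuses rounded modern planetary values (an assumption, not historical data), fits a power law on the four inner planets, and tests the resulting prediction on Jupiter and Saturn. The split and the function names `fit_power_law` and `predict_period` are hypothetical choices for illustration.

```python
import math

# Rounded modern values: (semi-major axis in AU, orbital period in years).
# Assumption: illustrative figures; the exploration/confirmation split is arbitrary.
EXPLORATION = {
    "Mercury": (0.387, 0.241),
    "Venus": (0.723, 0.615),
    "Earth": (1.000, 1.000),
    "Mars": (1.524, 1.881),
}
CONFIRMATION = {
    "Jupiter": (5.203, 11.862),
    "Saturn": (9.537, 29.457),
}


def fit_power_law(data):
    """Least-squares fit of log(period) = intercept + exponent * log(distance)."""
    xs = [math.log(dist) for dist, _ in data.values()]
    ys = [math.log(period) for _, period in data.values()]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    exponent = sxy / sxx
    intercept = mean_y - exponent * mean_x
    return exponent, intercept


def predict_period(distance, exponent, intercept):
    """Period predicted by the fitted power law for a given distance."""
    return math.exp(intercept + exponent * math.log(distance))


# Step 1: data-driven exploration yields a pattern (an exponent near 3/2).
exponent, intercept = fit_power_law(EXPLORATION)

# Step 2: the pattern becomes a prediction, confirmed or refuted on new data.
relative_errors = {
    name: abs(predict_period(dist, exponent, intercept) - period) / period
    for name, (dist, period) in CONFIRMATION.items()
}

if __name__ == "__main__":
    print(f"fitted exponent: {exponent:.3f}")
    for name, err in relative_errors.items():
        print(f"{name}: relative error on held-out data = {err:.2%}")
```

The point of the design is that the confirmation planets are never seen during fitting, so a small error there is evidence for the pattern rather than a by-product of the search.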

6/6
You can follow @GaelVaroquaux.