I submitted data & syntax of our #COVID19 paper on student mental health ( https://psyarxiv.com/36xkp ) to a recent #reproducibility hackathon @reprohack, & received detailed feedback. Below a few insights.🧵

You can find more information on reprohacks here: https://n8cir.org.uk/news/reprohacks/
My motivation for participating was that this is the first dataset I collected on which I actually published (usually I collect data just for internal bachelor projects). I'm used to sharing #rstats code, but sharing data & codebooks was new, & getting feedback on this was super helpful.
The overall report was positive (most main outcomes could be reproduced), but the repository only received a 7/10 reproducibility rating, and my code & documentation left much to be desired. 5 main insights below.
Insight 1: I thought I wrote clean code because it looks pretty, but pretty ain't the same thing as clean. I would be glad if folks could post basic guidelines on code formatting here; clearly there are some standards I am not aware of & need to catch up on. Thanks.
Insight 2: I work in a clean folder structure, e.g. save all pdfs into /figures. Challenging for reproducibility ofc, because the code breaks if those folders don't exist on someone else's machine. While easy to fix (comment out folders), I wonder what recommended guidelines are: I also don't want to have 100 data, code, figure etc files in 1 messy folder.
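One way to keep the folder structure without breaking things for others is to let the script create its own output folder and build paths relative to the project root. A minimal sketch in base R, not the code from the paper; the folder name `figures` and the file name `example_plot.pdf` are placeholders:

```r
# Hypothetical folder name; adjust to your own project layout.
fig_dir <- "figures"

# Create the folder if it is missing, so the script also runs on a fresh download.
dir.create(fig_dir, showWarnings = FALSE, recursive = TRUE)

# Build the path relative to the project root instead of hard-coding it.
fig_path <- file.path(fig_dir, "example_plot.pdf")

# Save a placeholder figure into the folder.
pdf(fig_path, width = 6, height = 4)
plot(1:10, main = "Placeholder figure")
invisible(dev.off())
```

This assumes the script is run with the project root as working directory (e.g. from an RStudio project).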
Insight 3: Some things in R are impossible to reproduce given a seed bc they depend on Weird Internal Workings (e.g. the chaotic Fruchterman-Reingold layout algorithm for network models). This means I need to rely on other tools to make graphs reproducible, e.g. saving the actual graph layout (net$layout).
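Saving the layout could look roughly like this: compute it once, write the coordinate matrix to disk, and read it back in every later run. A base R sketch under assumptions: the file name `network_layout.rds` is made up, and the random matrix is only a stand-in for the real layout taken from the plotted network object (e.g. `net$layout`).

```r
# Hypothetical file name for the stored layout.
layout_file <- "network_layout.rds"

if (!file.exists(layout_file)) {
  # Stand-in for a layout taken from a plotted network (e.g. net$layout):
  # one row per node, two columns with x and y coordinates.
  set.seed(1)
  node_layout <- matrix(runif(10), ncol = 2,
                        dimnames = list(NULL, c("x", "y")))
  saveRDS(node_layout, layout_file)
}

# Every later run reads the stored coordinates instead of recomputing them.
fixed_layout <- readRDS(layout_file)
```

Sharing the `.rds` file alongside the code means others get the same node positions no matter what the layout algorithm does on their machine.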
Insight 4: Teaching is hard. I used code to create separate files for my 16 students working w the data who are not (very) proficient in R. I thought I could create those & then just delete that part, but some remnants remained in the code. Better to keep this completely separate next time.
Insight 5: I don't totally suck. Yay!

"The figures fully reproduced & were also saved in the correct folders. Top notch. Great to have session info in the supplement. Super nice that the questionnaire was included on the OSF. Good heading (including date that code was written)."
A list of general recommendations below. They are all excellent, and most of them quite easy to implement.
Overall, brilliant experience which taught me a lot about sharing my 1st data. Thx to organizers (eg @RProppert) & repro-hackers (eg @lindanab1 @annloh) for putting time & effort into this.

And thx for *constructive* feedback (the good & the bad!).
Finally: how can we as researchers support these & similar events? How can we acknowledge the work you invest into improving our work?

/end 🧵.
You can follow @EikoFried.