

This initiative, led by @TaliaStroud @j_a_tucker + @AnnieFranco @chadkdj on the FB side, will study social media's impact on democracy with unprecedented data access.
Many researchers inside FB, myself included, have fought for a long time to see something like this realized.
Get this: it's all pre-registered. That's key for the cred of the effort, not just to show 'I didn't p-hack', but for the ORIGINAL motivation--as a credible, costly signal that inconvenient results have not simply been put into the 'file drawer', never to see the light of day.
Expect a lot of mixed survey-behavioral data research designs here, because running surveys allows you to get informed consent, which opens up a vast array of potential research designs not possible when just analyzing log data.
There's also a commitment to establish a process to allow for replication, key for publishing in top journals. This is AFAIK unprecedented and would be a huge innovation with implications for all industry-academic partnerships utilizing big, sensitive data.
And they've committed to publish every project, period, even if not accepted in a journal.
One of the smartest innovations here is to have FB staff analyze the data. While the URLs data I helped release has purpose & utility, the *scope of data* potentially on offer here is far broader. I HAVE SO MUCH TO SAY ON THIS POINT KEEP READING
The most important data at Facebook is graph data: the connections
1. between people
2. between people & interests, preferences, group identities
3. between people & ideas
DUH. THAT'S HOW FB KEEPS PEOPLE ON THE SITE & SELLS ADS SO WELL
We all saw how hard it has been since 2018 to share ordinary FB data w/ researchers in ways that comply w/ privacy law. Just making aggregated data on exposure to URLs available under DP was a herculean task; scroll through the detail in the documentation: https://solomonmg.github.io/pdf/Facebook_DP_URLs_Dataset.pdf
But it's (near?) impossible to protect graph data under differential privacy & still allow researchers to answer their questions. This innovation cuts the Gordian knot & allows potentially any data at FB to be analyzed w/ help from top research scientists like @AnnieFranco, who..
like many at FB (& esp Core Data Science), could be a top DS at any co & has top-tier research chops (AND btw has a publication in Science on the file drawer problem)
Can we trust FB employees to keep everything legit? Well I trust many of them personally. But you don't have to, because they've built in a monitoring system to ensure scientific integrity, presumably operating under some kind of audit framework.